Merged
67 commits
c20168c
wip: Probably working right.
yhakbar Feb 21, 2025
a521804
fix: Cleaning things up a bit
yhakbar Feb 24, 2025
cb731e6
fix: Getting rid of unused constant
yhakbar Feb 24, 2025
1a54f04
fix: Fixing remote reference optimization
yhakbar Feb 24, 2025
6d70093
fix: More performance improvements
yhakbar Feb 24, 2025
0b592ea
fix: Performance testing done.
yhakbar Feb 24, 2025
52a9cd1
fix: Tossing some unnecessary comments
yhakbar Feb 24, 2025
4802efe
feat: Attempting implementation
yhakbar Feb 24, 2025
5629b2c
fix: Fixing recursive storage of trees
yhakbar Feb 24, 2025
9264dc3
feat: Upgrade to `go-getter` v2
yhakbar Feb 25, 2025
15d4fa3
feat: Use go-getter for cln protocol
yhakbar Feb 25, 2025
629f98b
fix: Removing debug logging
yhakbar Feb 25, 2025
926fc74
fix: Using an early return
yhakbar Feb 25, 2025
add47f5
fix: Some updates to how scaffolding gets done after catalog
yhakbar Feb 25, 2025
67ce742
fix: Updating experiment name
yhakbar Feb 25, 2025
1f4704f
fix: Rename `clngo` to `cln`
yhakbar Feb 25, 2025
95f3919
fix: Linting
yhakbar Feb 25, 2025
28aad5a
fix: Revert changes to `catalog` command
yhakbar Feb 25, 2025
7ab6f05
fix: Refactoring clone so that it's easier to work with
yhakbar Feb 25, 2025
477e496
fix: Switch to explicit sentinel instead of relying on `.git`
yhakbar Feb 26, 2025
4674e3f
fix: Renaming `repo` to `url`
yhakbar Feb 26, 2025
1549810
fix: Fixing tree linking
yhakbar Feb 26, 2025
0ee6b48
fix: Mostly fixed
yhakbar Feb 26, 2025
23192d4
fix: Tests fixed
yhakbar Feb 26, 2025
46bf743
fix: Fully repaired cln
yhakbar Feb 26, 2025
172901f
feat: Concurrently linking the tree
yhakbar Feb 26, 2025
06c4fe3
feat: Reducing concurrency for linking
yhakbar Feb 27, 2025
09950d8
feat: Persisting select `.git` files
yhakbar Feb 27, 2025
b2650f2
feat: Integrating `cln` into catalog
yhakbar Feb 27, 2025
996d9ad
feat: Renaming cln to CAS
yhakbar Feb 27, 2025
28e4e69
feat: Adjusting store path
yhakbar Feb 27, 2025
ac38312
feat: Paritioning content
yhakbar Feb 27, 2025
acebb90
fix: Refactoring a bit
yhakbar Feb 27, 2025
d69785f
fix: Lock access to the map
yhakbar Feb 27, 2025
557290b
fix: Improving integration into catalog
yhakbar Feb 27, 2025
2196a3a
fix: Linting
yhakbar Feb 27, 2025
18b7990
feat: Adding experiment docs
yhakbar Feb 27, 2025
5f2ab42
feat: Adding CAS feature docs
yhakbar Feb 27, 2025
8a66be5
fix: Filling in placeholder link
yhakbar Feb 27, 2025
d0f60f7
fix: Addressing CodeRabbit feedback
yhakbar Feb 27, 2025
27daacb
fix: Use parent logger
yhakbar Feb 27, 2025
21996f6
fix: Linting errors
yhakbar Feb 27, 2025
8f9b5ce
fix: Working on optimization
yhakbar Mar 4, 2025
b3b4d6f
fix: Tests working
yhakbar Mar 4, 2025
5e692a5
fix: Cleanup
yhakbar Mar 4, 2025
97d7859
fix: Adding tmp timestamp check
yhakbar Mar 4, 2025
60e7b2b
fix: Remove unused constants
yhakbar Mar 4, 2025
4c53664
feat: Adding `LinkTree` test
yhakbar Mar 4, 2025
9b1a4fc
fix: Update implementation to test table
yhakbar Mar 4, 2025
456c68c
fix: Linting
yhakbar Mar 4, 2025
53d2566
fix: Deleting some dead code
yhakbar Mar 4, 2025
12fe862
fix: Fixing Windows test of hard linked tree
yhakbar Mar 4, 2025
adf258f
fix: Fixing unnecessary imports
yhakbar Mar 4, 2025
cb71191
feat: Bumping Catalog to use go-getter v2
yhakbar Mar 4, 2025
58f3ea3
wip: Adding go-getter
yhakbar Mar 4, 2025
3844890
fix: Adjusting abstraction so that clone gets options
yhakbar Mar 4, 2025
baa76b0
fix: Refactoring
yhakbar Mar 4, 2025
1e0805d
fix: Adding getter
yhakbar Mar 4, 2025
ebe538c
fix: Use go-getter for clones
yhakbar Mar 4, 2025
87617ad
fix: Updating docs for updated CAS implementation
yhakbar Mar 4, 2025
edd85b1
fix: Linting
yhakbar Mar 5, 2025
428d562
fix: Getting rid of command cache
yhakbar Mar 5, 2025
46b8000
fix: Fixing comment
yhakbar Mar 5, 2025
c6efc57
fix: Addressing CodeRabbit feedback
yhakbar Mar 5, 2025
f50e994
fix: Removing unnecessary lock
yhakbar Mar 5, 2025
49291b1
feat: Plumbing ctx
yhakbar Mar 5, 2025
5086658
feat: Addressing review feedback
yhakbar Mar 10, 2025
4 changes: 1 addition & 3 deletions cli/app_test.go
@@ -533,8 +533,6 @@ func (err argMissingValueError) Error() string {
}

func TestAutocomplete(t *testing.T) { //nolint:paralleltest
defer os.Unsetenv("COMP_LINE")

testCases := []struct {
compLine string
expectedCompletes []string
@@ -558,7 +556,7 @@ func TestAutocomplete(t *testing.T) { //nolint:paralleltest
}

for _, testCase := range testCases {
os.Setenv("COMP_LINE", "terragrunt "+testCase.compLine)
t.Setenv("COMP_LINE", "terragrunt "+testCase.compLine)
Contributor

👍


output := &bytes.Buffer{}
opts := options.NewTerragruntOptionsWithWriters(output, os.Stderr)
5 changes: 3 additions & 2 deletions cli/commands/catalog/action.go
@@ -38,11 +38,12 @@ func Run(ctx context.Context, opts *options.TerragruntOptions, repoURL string) e
var modules module.Modules

walkWithSymlinks := opts.Experiments.Evaluate(experiment.Symlinks)
allowCAS := opts.Experiments.Evaluate(experiment.CAS)

for _, repoURL := range repoURLs {
tempDir := filepath.Join(os.TempDir(), fmt.Sprintf(tempDirFormat, util.EncodeBase64Sha1(repoURL)))
path := filepath.Join(os.TempDir(), fmt.Sprintf(tempDirFormat, util.EncodeBase64Sha1(repoURL)))

repo, err := module.NewRepo(ctx, opts.Logger, repoURL, tempDir, walkWithSymlinks)
repo, err := module.NewRepo(ctx, opts.Logger, repoURL, path, walkWithSymlinks, allowCAS)
if err != nil {
return err
}
2 changes: 1 addition & 1 deletion cli/commands/catalog/module/module.go
@@ -32,7 +32,7 @@ type Module struct {
url string
}

// NewModule returns a module instance if the given `moduleDir` path contains a Terraform module, otherwise returns nil.
// NewModule returns a module instance if the given `moduleDir` path contains an OpenTofu/Terraform module, otherwise returns nil.
func NewModule(repo *Repo, moduleDir string) (*Module, error) {
module := &Module{
Repo: repo,
174 changes: 138 additions & 36 deletions cli/commands/catalog/module/repo.go
@@ -3,7 +3,6 @@ package module
import (
"context"
"fmt"
"net/url"
"os"
"path/filepath"
"regexp"
@@ -13,10 +12,11 @@ import (

"github.com/gitsight/go-vcsurl"
"github.com/gruntwork-io/go-commons/files"
"github.com/gruntwork-io/terragrunt/internal/cas"
"github.com/gruntwork-io/terragrunt/internal/errors"
"github.com/gruntwork-io/terragrunt/pkg/log"
"github.com/gruntwork-io/terragrunt/tf"
"github.com/hashicorp/go-getter"
"github.com/hashicorp/go-getter/v2"
"gopkg.in/ini.v1"
)

@@ -27,13 +27,17 @@ const (
azuredevHost = "dev.azure.com"
bitbucketHost = "bitbucket.org"
gitlabSelfHostedRegex = `^(gitlab\.(.+))$`

cloneCompleteSentinel = ".catalog-clone-complete"
)

var (
gitHeadBranchNameReg = regexp.MustCompile(`^.*?([^/]+)$`)
repoNameFromCloneURLReg = regexp.MustCompile(`(?i)^.*?([-a-z_.]+)[^/]*?(?:\.git)?$`)
repoNameFromCloneURLReg = regexp.MustCompile(`(?i)^.*?([-a-z0-9_.]+)[^/]*?(?:\.git)?$`)

modulesPaths = []string{"modules"}

includedGitFiles = []string{"HEAD", "config"}
)
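The change to `repoNameFromCloneURLReg` adds `0-9` to the captured character class. A minimal sketch of the effect, using both patterns as they appear in the diff; the helper `repoName`, the variable names, and the example URL are hypothetical and only serve the illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// Both patterns are copied from the diff; only the character class differs.
var (
	oldRepoNameReg = regexp.MustCompile(`(?i)^.*?([-a-z_.]+)[^/]*?(?:\.git)?$`)
	newRepoNameReg = regexp.MustCompile(`(?i)^.*?([-a-z0-9_.]+)[^/]*?(?:\.git)?$`)
)

// repoName returns the first capture group, or "temp" when nothing matches.
// This is a hypothetical helper that follows the fallback in extractRepoName.
func repoName(re *regexp.Regexp, cloneURL string) string {
	if m := re.FindStringSubmatch(cloneURL); len(m) > 1 && m[1] != "" {
		return m[1]
	}
	return "temp"
}

func main() {
	cloneURL := "https://example.com/acme/infra2" // hypothetical URL

	fmt.Println("old:", repoName(oldRepoNameReg, cloneURL))
	fmt.Println("new:", repoName(newRepoNameReg, cloneURL))
}
```

With the old pattern the capture stops at the first digit in the last path segment; with the new one the digit is part of the captured name.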

type Repo struct {
@@ -46,17 +50,19 @@ type Repo struct {
BranchName string

walkWithSymlinks bool
allowCAS bool
}

func NewRepo(ctx context.Context, logger log.Logger, cloneURL, tempDir string, walkWithSymlinks bool) (*Repo, error) {
func NewRepo(ctx context.Context, l log.Logger, cloneURL, path string, walkWithSymlinks bool, allowCAS bool) (*Repo, error) {
repo := &Repo{
logger: logger,
logger: l,
cloneURL: cloneURL,
path: tempDir,
path: path,
walkWithSymlinks: walkWithSymlinks,
allowCAS: allowCAS,
}

if err := repo.clone(ctx); err != nil {
if err := repo.clone(ctx, l); err != nil {
return nil, err
}

@@ -163,69 +169,164 @@ func (repo *Repo) ModuleURL(moduleDir string) (string, error) {
return "", errors.Errorf("hosting: %q is not supported yet", remote.Host)
}

// clone clones the repository to a temporary directory if the repoPath is a URL
func (repo *Repo) clone(ctx context.Context) error {
type CloneOptions struct {
SourceURL string
TargetPath string
Context context.Context
Logger log.Logger
}

func (repo *Repo) clone(ctx context.Context, l log.Logger) error {
Contributor

Why pass `l log.Logger` as an argument to the func when you can use `repo.logger` in all nested funcs?

cloneURL, err := repo.resolveCloneURL()
if err != nil {
return err
}

// Handle local directory case
if files.IsDir(cloneURL) {
Contributor

Suggested change
if files.IsDir(cloneURL) {
if util.IsDir(cloneURL) {

I know it's not your code, but we need to use one thing everywhere. If we go with go-commons, we should remove the duplicate functions from the util package.

return repo.handleLocalDir(cloneURL)
}

// Prepare clone options
opts := CloneOptions{
SourceURL: cloneURL,
TargetPath: repo.path,
Context: ctx,
Logger: repo.logger,
}

if err := repo.prepareCloneDirectory(); err != nil {
return err
}

if repo.cloneCompleted() {
repo.logger.Debugf("The repo dir exists and %q exists. Skipping cloning.", cloneCompleteSentinel)

return nil
}

return repo.performClone(ctx, l, &opts)
}

func (repo *Repo) resolveCloneURL() (string, error) {
if repo.cloneURL == "" {
currentDir, err := os.Getwd()
if err != nil {
return errors.New(err)
return "", errors.New(err)
}

repo.cloneURL = currentDir
return currentDir, nil
}

if repoPath := repo.cloneURL; files.IsDir(repoPath) {
if !filepath.IsAbs(repoPath) {
absRepoPath, err := filepath.Abs(repoPath)
if err != nil {
return errors.New(err)
}
return repo.cloneURL, nil
}
Comment on lines +211 to +222
Contributor

🛠️ Refactor suggestion

Double-check potential data loss during re-cloning.

prepareCloneDirectory removes the existing directory if the sentinel file isn’t present. If users mistakenly pointed at a directory with important data, it would be wiped out. Consider adding a confirmation mechanism or additional checks to ensure the directory is truly safe to remove.

Also applies to: 237-253, 255-273
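One way to act on this suggestion is to refuse to delete anything outside the OS temp directory. The sketch below is not part of the PR; `isWithin` and `removeIfTemporary` are hypothetical helpers, and the check assumes both paths are absolute and does not resolve symlinks:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// isWithin reports whether target is located strictly inside base.
// It does not resolve symlinks, so it is a lexical check only.
func isWithin(base, target string) bool {
	rel, err := filepath.Rel(base, target)
	if err != nil {
		return false
	}

	return rel != "." && rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// removeIfTemporary deletes dir only when it lives under the OS temp directory.
func removeIfTemporary(dir string) error {
	if !isWithin(os.TempDir(), dir) {
		return fmt.Errorf("refusing to remove %q: not inside %q", dir, os.TempDir())
	}

	return os.RemoveAll(dir)
}

func main() {
	// A path under the temp directory is removed (a no-op if it does not exist).
	fmt.Println(removeIfTemporary(filepath.Join(os.TempDir(), "catalog-example")))

	// A path elsewhere is rejected with an error instead of being deleted.
	fmt.Println(removeIfTemporary(filepath.Join(string(filepath.Separator), "home", "user", "projects")))
}
```

A guard like this trades flexibility for safety: a caller who deliberately passes a path outside the temp directory would need a separate, explicit opt-in.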


repo.logger.Debugf("Converting relative path %q to absolute %q", repoPath, absRepoPath)
func (repo *Repo) handleLocalDir(repoPath string) error {
if !filepath.IsAbs(repoPath) {
absRepoPath, err := filepath.Abs(repoPath)
if err != nil {
return errors.New(err)
}

repo.path = repoPath
repo.logger.Debugf("Converting relative path %q to absolute %q", repoPath, absRepoPath)
repo.path = absRepoPath

return nil
}

repo.path = repoPath

return nil
}

func (repo *Repo) prepareCloneDirectory() error {
if err := os.MkdirAll(repo.path, os.ModePerm); err != nil {
return errors.New(err)
}

repoName := repo.extractRepoName()
repo.path = filepath.Join(repo.path, repoName)

// Clean up incomplete clones
if repo.shouldCleanupIncompleteClone() {
repo.logger.Debugf("The repo dir exists but %q does not. Removing the repo dir for cloning from the remote source.", cloneCompleteSentinel)

if err := os.RemoveAll(repo.path); err != nil {
return errors.New(err)
}
}

return nil
}

func (repo *Repo) extractRepoName() string {
repoName := "temp"
if match := repoNameFromCloneURLReg.FindStringSubmatch(repo.cloneURL); len(match) > 0 && match[1] != "" {
repoName = match[1]
}

repo.path = filepath.Join(repo.path, repoName)
return repoName
}

// Since we clone the repository into a temporary directory, some operating systems such as macOS run a service that deletes files that have not been accessed for a long time.
// On macOS, that service deletes only the files and leaves the directory structure untouched, which misleads `go-getter` into thinking that the repository exists even though it cannot update it because the files are missing. In such cases, we simply delete the temporary directory and clone the repository again.
// See https://github.com/gruntwork-io/terragrunt/pull/2888
if files.FileExists(repo.path) && !files.FileExists(repo.gitHeadfile()) {
repo.logger.Debugf("The repo dir exists but git file %q does not. Removing the repo dir for cloning from the remote source.", repo.gitHeadfile())
func (repo *Repo) shouldCleanupIncompleteClone() bool {
return files.FileExists(repo.path) && !repo.cloneCompleted()
}

if err := os.RemoveAll(repo.path); err != nil {
return errors.New(err)
func (repo *Repo) cloneCompleted() bool {
return files.FileExists(filepath.Join(repo.path, cloneCompleteSentinel))
Contributor

Suggested change
return files.FileExists(filepath.Join(repo.path, cloneCompleteSentinel))
return util.FileExists(filepath.Join(repo.path, cloneCompleteSentinel))

}

func (repo *Repo) performClone(ctx context.Context, l log.Logger, opts *CloneOptions) error {
client := getter.DefaultClient

if repo.allowCAS {
c, err := cas.New(cas.Options{})
if err != nil {
return err
}

cloneOpts := cas.CloneOptions{
Dir: repo.path,
IncludedGitFiles: includedGitFiles,
}

client.Getters = append([]getter.Getter{cas.NewCASGetter(&l, c, &cloneOpts)}, client.Getters...)
}

sourceURL, err := tf.ToSourceURL(repo.cloneURL, "")
sourceURL, err := tf.ToSourceURL(opts.SourceURL, "")
if err != nil {
return err
}

repo.cloneURL = sourceURL.String()
opts.Logger.Infof("Cloning repository %q to temporary directory %q", repo.cloneURL, repo.path)

repo.logger.Infof("Cloning repository %q to temporary directory %q", repo.cloneURL, repo.path)
// Check first if the query param ref is already set
q := sourceURL.Query()

// We need to explicitly specify the reference, otherwise we will get an error:
// "fatal: The empty string is not a valid pathspec. Use . instead if you wanted to match all paths"
// when updating an existing repository.
sourceURL.RawQuery = (url.Values{"ref": []string{"HEAD"}}).Encode()
ref := q.Get("ref")
if ref != "" {
q.Set("ref", "HEAD")
}

sourceURL.RawQuery = q.Encode()

if err := getter.Get(repo.path, strings.Trim(sourceURL.String(), "/"), getter.WithContext(ctx), getter.WithMode(getter.ClientModeDir)); err != nil {
_, err = client.Get(ctx, &getter.Request{
Src: sourceURL.String(),
Dst: repo.path,
GetMode: getter.ModeDir,
})
if err != nil {
return err
}

// Create the sentinel file to indicate that the clone is complete
f, err := os.Create(filepath.Join(repo.path, cloneCompleteSentinel))
if err != nil {
return errors.New(err)
}

if err := f.Close(); err != nil {
return errors.New(err)
}
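The comment in this hunk says the code checks whether the `ref` query parameter is already set, but the condition `ref != ""` sets `HEAD` only when a ref is present and leaves an empty one untouched. If the intent is to default to `HEAD` only when no ref was given, that behavior could look like the sketch below. This is an assumption about intent, not the PR's code; `withDefaultRef` and the example URLs are hypothetical:

```go
package main

import (
	"fmt"
	"net/url"
)

// withDefaultRef returns rawURL with ref=HEAD added only when no ref is present.
// An explicitly provided ref is left unchanged.
func withDefaultRef(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}

	q := u.Query()
	if q.Get("ref") == "" {
		q.Set("ref", "HEAD")
	}

	u.RawQuery = q.Encode()

	return u.String(), nil
}

func main() {
	for _, src := range []string{
		"https://example.com/acme/modules.git",
		"https://example.com/acme/modules.git?ref=v1.2.0",
	} {
		out, err := withDefaultRef(src)
		if err != nil {
			panic(err)
		}

		fmt.Println(out)
	}
}
```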

@@ -237,7 +338,7 @@ func (repo *Repo) parseRemoteURL() error {
gitConfigPath := filepath.Join(repo.path, ".git", "config")

if !files.FileExists(gitConfigPath) {
return errors.Errorf("the specified path %q is not a git repository", repo.path)
return errors.Errorf("the specified path %q is not a git repository (no .git/config file found)", repo.path)
}

repo.logger.Debugf("Parsing git config %q", gitConfigPath)
@@ -280,11 +381,12 @@ func (repo *Repo) gitHeadfile() string {
func (repo *Repo) parseBranchName() error {
data, err := files.ReadFileAsString(repo.gitHeadfile())
if err != nil {
return errors.Errorf("the specified path %q is not a git repository", repo.path)
return errors.Errorf("the specified path %q is not a git repository (no .git/HEAD file found)", repo.path)
}

if match := gitHeadBranchNameReg.FindStringSubmatch(data); len(match) > 0 {
repo.BranchName = strings.TrimSpace(match[1])

return nil
}

2 changes: 1 addition & 1 deletion cli/commands/catalog/module/repo_test.go
@@ -64,7 +64,7 @@ func TestFindModules(t *testing.T) {

ctx := context.Background()

repo, err := module.NewRepo(ctx, log.New(), testCase.repoPath, "", false)
repo, err := module.NewRepo(ctx, log.New(), testCase.repoPath, "", false, false)
require.NoError(t, err)

modules, err := repo.FindModules(ctx)
39 changes: 39 additions & 0 deletions docs-starlight/src/content/docs/02-features/14-cas.md
@@ -0,0 +1,39 @@
---
title: Content Addressable Store (CAS)
description: Learn how Terragrunt supports deduplication of content using a Content Addressable Store (CAS).
slug: docs/features/cas
sidebar:
  order: 14
---

Terragrunt supports a Content Addressable Store (CAS) to deduplicate content across multiple Terragrunt configurations. This feature is still experimental and not recommended for general production usage.

At the moment, the only supported use case for the CAS is to speed up catalog cloning. In the future, the CAS can be used to store more content.

To use the CAS, you will need to enable the [cas](/docs/reference/experiments/#cas) experiment.

## Usage

When you enable the `cas` experiment, Terragrunt will automatically use the CAS when cloning any compatible source (right now, only Git repositories).

```hcl
# root.hcl

catalog {
  urls = [
    "git@github.com:acme/modules.git"
  ]
}
```

When Terragrunt clones a repository while using the CAS and the repository is not found in the CAS, Terragrunt will clone the repository from the original URL and store it in the CAS for future use.

When generating a repository from the CAS, Terragrunt will hard link entries from the CAS to the new repository. This allows Terragrunt to deduplicate content across multiple repositories.

In the event that hard linking fails due to some operating system / host incompatibility with hard links, Terragrunt will fall back to performing copies of the content from the CAS.

## Storage

The CAS is stored in the `~/.cache/terragrunt/cas` directory. This directory can be safely deleted at any time, as Terragrunt will automatically regenerate the CAS as needed.

Avoid deleting only part of the CAS directory, as that might result in partially cloned repositories and unexpected behavior.
26 changes: 26 additions & 0 deletions docs-starlight/src/content/docs/04-reference/04-experiments.md
@@ -166,3 +166,29 @@ To transition `cli-redesign` features to a stable release, the following must be
- [ ] Add support for `find` with the `exclude` block used to exclude units from the search.
- [ ] Add integration with `symlinks` experiment to support finding units/stacks via symlinks.
- [ ] Add support for the `list` command.

### `cas`

Support for Terragrunt Content Addressable Storage (CAS).

#### `cas` - What it does

Allow Terragrunt to store and retrieve content from a Content Addressable Storage (CAS) system.

At the moment, the CAS is only used to speed up catalog cloning, but in the future, it can be used to store more content.

#### `cas` - How to provide feedback

Share your experience with this feature in the [CAS](https://github.com/gruntwork-io/terragrunt/discussions/3939) Feedback GitHub Discussion.
Feedback is crucial for ensuring the feature meets real-world use cases. Please include:

- Any bugs or issues encountered (including logs or stack traces if possible).
- Suggestions for additional improvements or enhancements.

#### `cas` - Criteria for stabilization

To transition the `cas` feature to a stable release, the following must be addressed:

- [x] Add support for storing and retrieving catalog repositories from the CAS.
- [ ] Add support for storing and retrieving OpenTofu/Terraform modules from the CAS.
- [ ] Add support for storing and retrieving Unit/Stack configurations from the CAS.