Skip to content

change snapshot to include worker template and set the registry path based on the template #325

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Mandukhai-Alimaa
Copy link
Contributor

No description provided.

@@ -64,6 +68,15 @@ func NewGcpWorkerPool() *WorkerPool {
}
fmt.Printf("Instance: %s\n", instance)

fmt.Printf("STEP 2a: prepare snapshot with GCS lambda store config\n")
if err := createGcsTemplate(); err != nil {
panic(fmt.Errorf("failed to create GCS template.json: %v", err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%w

panic(fmt.Errorf("failed to create GCS template.json: %v", err))
}

// Brief pause to ensure filesystem operations are fully committed before snapshot
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the change is buffered for longer that 1s? Can we force it to disk with a sync? https://pkg.go.dev/os#File.Sync

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you do sync below. Was it not working already, without the sleep? We need to understand this a bit better (I don't trust sleep to robustly solve any problem remaining).


// Sync the directory to ensure the rename is persisted before snapshot
dirFile, err := os.Open(filepath.Dir(filePath))
if err == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we check if there is an error?

@@ -305,6 +355,64 @@ func SandboxConfJson() string {
return string(s)
}

// SaveConfigAtomic saves a config to a file atomically to prevent race conditions
// This is used for template.json creation to avoid corruption during concurrent access
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually have concurrent access, or is the real concern that the data is not flushed to disk before the disk snapshot begins?

Is there any downside to just having a regular SaveConfig that always takes special care with sync/rename?

dir := filepath.Dir(filePath)
tempFile, err := os.CreateTemp(dir, filepath.Base(filePath)+".tmp.*")
if err != nil {
return fmt.Errorf("failed to create temp file: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%w

func SaveConfigAtomic(cfg *Config, filePath string) error {
// Create temp file in same directory as target file
dir := filepath.Dir(filePath)
tempFile, err := os.CreateTemp(dir, filepath.Base(filePath)+".tmp.*")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename is frequently used to atomically update a file, but I'm not sure it will work correctly if we're renaming across different file systems. I'm not sure the temp file will always be on the same FS as the final final. Perhaps we should write to filePath + ".tmp" so that the write goes to the same dir/FS.


// Atomic rename - this is the key operation that prevents corruption
if err := os.Rename(tempPath, filePath); err != nil {
return fmt.Errorf("failed to rename temp file: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%w

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and all other places too

}

// Close temp file
if err := tempFile.Close(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little complicated with Close in two places. But I understand why you're doing it: because you want Close to happen before the rename (not when this function returns). Maybe we should move the file write logic out to a new file, so that a defer Close() that runs at the end is appropriate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants