-
Notifications
You must be signed in to change notification settings - Fork 129
change snapshot to include worker template and set the registry path based on the template #325
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
change snapshot to include worker template and set the registry path based on the template #325
Conversation
@@ -64,6 +68,15 @@ func NewGcpWorkerPool() *WorkerPool { | |||
} | |||
fmt.Printf("Instance: %s\n", instance) | |||
|
|||
fmt.Printf("STEP 2a: prepare snapshot with GCS lambda store config\n") | |||
if err := createGcsTemplate(); err != nil { | |||
panic(fmt.Errorf("failed to create GCS template.json: %v", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%w
panic(fmt.Errorf("failed to create GCS template.json: %v", err)) | ||
} | ||
|
||
// Brief pause to ensure filesystem operations are fully committed before snapshot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the change is buffered for longer that 1s? Can we force it to disk with a sync? https://pkg.go.dev/os#File.Sync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see you do sync below. Was it not working already, without the sleep? We need to understand this a bit better (I don't trust sleep to robustly solve any problem remaining).
|
||
// Sync the directory to ensure the rename is persisted before snapshot | ||
dirFile, err := os.Open(filepath.Dir(filePath)) | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we check if there is an error?
@@ -305,6 +355,64 @@ func SandboxConfJson() string { | |||
return string(s) | |||
} | |||
|
|||
// SaveConfigAtomic saves a config to a file atomically to prevent race conditions | |||
// This is used for template.json creation to avoid corruption during concurrent access |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually have concurrent access, or is the real concern that the data is not flushed to disk before the disk snapshot begins?
Is there any downside to just having a regular SaveConfig that always takes special care with sync/rename?
dir := filepath.Dir(filePath) | ||
tempFile, err := os.CreateTemp(dir, filepath.Base(filePath)+".tmp.*") | ||
if err != nil { | ||
return fmt.Errorf("failed to create temp file: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%w
func SaveConfigAtomic(cfg *Config, filePath string) error { | ||
// Create temp file in same directory as target file | ||
dir := filepath.Dir(filePath) | ||
tempFile, err := os.CreateTemp(dir, filepath.Base(filePath)+".tmp.*") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename is frequently used to atomically update a file, but I'm not sure it will work correctly if we're renaming across different file systems. I'm not sure the temp file will always be on the same FS as the final final. Perhaps we should write to filePath + ".tmp" so that the write goes to the same dir/FS.
|
||
// Atomic rename - this is the key operation that prevents corruption | ||
if err := os.Rename(tempPath, filePath); err != nil { | ||
return fmt.Errorf("failed to rename temp file: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%w
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and all other places too
} | ||
|
||
// Close temp file | ||
if err := tempFile.Close(); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little complicated with Close in two places. But I understand why you're doing it: because you want Close to happen before the rename (not when this function returns). Maybe we should move the file write logic out to a new file, so that a defer Close() that runs at the end is appropriate.
No description provided.