diff --git a/DESIGN.md b/DESIGN.md new file mode 100644 index 0000000..dfae0f9 --- /dev/null +++ b/DESIGN.md @@ -0,0 +1,61 @@ +# Design: YAML-First Interpolation + +## Problem Statement + +The current implementation performs string-based interpolation on the raw YAML file content before parsing. This approach has several issues: + +1. **Quoting complexity**: We must handle YAML quoting rules manually, leading to edge cases and bugs +2. **Type loss**: All interpolated values become strings, even if they should be numbers or booleans +3. **YAML special values**: Values like "no", "yes", "true", "false", "on", "off" require special handling +4. **Nested interpolation quoting**: Complex logic needed to avoid double-quoting in nested interpolations +5. **Fragile string manipulation**: Operating on raw YAML text is error-prone + +## Proposed Solution + +Parse the YAML file first, then walk the parsed structure to perform interpolations. This approach: + +1. **Leverages YAML parser**: Let the YAML library handle all parsing, quoting, and escaping +2. **Preserves types**: Can return actual numbers, booleans, etc. from interpolations +3. **Simplifies logic**: No string manipulation or quoting logic needed +4. **Clear semantics**: Users control types through YAML syntax (quoted vs unquoted) +5. **Robust**: YAML serializer handles all edge cases correctly + +## Implementation Details + +### Constraints + +- Interpolation only occurs in YAML values, not keys +- Users must properly quote interpolations if they want string output +- The entire file must be valid YAML before interpolation + +### Workflow + +1. Parse YAML file into a data structure (map/slice/scalar) +2. Recursively walk the structure: + - For string values: check for `${...}` patterns and interpolate + - For other types: pass through unchanged +3. Handle the special `env` key to inject environment variables +4. Return the modified structure + +### Type Handling + +Users control the output type through YAML syntax: + +```yaml +# String output (user quotes the interpolation) +port: "${ENV:PORT}" # Result: "8080" (string) + +# Numeric output (no quotes) +port: ${ENV:PORT} # Result: 8080 (number) + +# Boolean output +enabled: ${ENV:ENABLED} # Result: true/false (boolean) +``` + +### Benefits + +1. **Correctness**: YAML parser handles all edge cases +2. **Type safety**: Preserves intended types +3. **Simplicity**: Removes complex string manipulation +4. **Predictability**: Users have explicit control over types +5. **Maintainability**: Less code, fewer edge cases \ No newline at end of file diff --git a/README.md b/README.md index f2f8fc1..635e550 100644 --- a/README.md +++ b/README.md @@ -31,45 +31,49 @@ to the environment of the process that reads the config file. This allows a config file to serve as a bridge between fancy backend secret management services and "traditional" configuration via env vars. -## Important: Automatic Quoting +## Type-Preserving Interpolation -All interpolated values are automatically quoted and escaped for YAML safety. -This means you should NOT wrap interpolations in quotes: +smartconfig uses a YAML-first approach: it parses the YAML file first, then +walks the structure to perform interpolations. This means: + +1. **Type preservation**: Interpolated values can return appropriate types (numbers, booleans, strings) +2. **User control**: The YAML syntax determines the output type +3. **Clean design**: The YAML parser handles all quoting and escaping + +### How It Works + +When an interpolation is the entire value (not embedded in other text), smartconfig +attempts to convert the resolved value to the appropriate type: ```yaml -# Correct - no quotes needed -password: ${ENV:DB_PASSWORD} -enabled: ${ENV:FEATURE_ENABLED} - -# Incorrect - will result in double-quoted values -password: "${ENV:DB_PASSWORD}" -``` - -**Note:** ALL interpolated values are output as quoted strings, regardless of -their content. This means: - -1. YAML keywords like `no`, `yes`, `true`, `false`, `on`, `off`, etc. will - always be treated as strings, never as boolean values -2. Numbers will always be treated as strings, never as numeric values - -```yaml -# If ENV:FEATURE_ENABLED is "true", this will be the string "true", not boolean true -enabled: ${ENV:FEATURE_ENABLED} - -# If ENV:DEBUG_MODE is "no", this will be the string "no", not boolean false -debug: ${ENV:DEBUG_MODE} - -# If ENV:PORT is "8080", this will be the string "8080", not the number 8080 +# Numeric output - if ENV:PORT is "8080", this becomes the number 8080 port: ${ENV:PORT} -# If ENV:TIMEOUT is "30.5", this will be the string "30.5", not the float 30.5 -timeout: ${ENV:TIMEOUT} +# Boolean output - if ENV:ENABLED is "true", this becomes the boolean true +enabled: ${ENV:ENABLED} + +# String output - mixed content always returns strings +message: "Hello ${ENV:NAME}!" + +# Force string output - add any prefix/suffix +port_str: "port-${ENV:PORT}" ``` -This design choice prevents ambiguity and parsing errors, ensuring consistent -behavior across all interpolated values. Use the typed getter methods (GetInt, -GetFloat, GetBool, etc.) to convert string values to their appropriate types -when accessing configuration values. +### Type Conversion Rules + +For standalone interpolations (`value: ${...}`), smartconfig converts: +- `"true"` → `true` (boolean) +- `"false"` → `false` (boolean) +- `"123"` → `123` (integer) +- `"12.5"` → `12.5` (float) +- Everything else → string + +### Important Notes + +- YAML parser removes quotes, so `"${ENV:VAR}"` and `${ENV:VAR}` are treated identically +- To force string output, embed the interpolation in other text: `"prefix-${ENV:VAR}"` +- Mixed content (text with embedded interpolations) always returns strings +- Nested interpolations are fully supported: `${ENV:PREFIX_${ENV:SUFFIX}}` # Usage @@ -220,6 +224,7 @@ if err != nil { ```go // Returns (int, error) +// Works with both native integers and string values port, err := config.GetInt("server.port") if err != nil { log.Printf("Error: %v", err) @@ -374,10 +379,19 @@ export DB_PASSWORD="secret123" echo 'password: ${ENV:DB_PASSWORD}' | ./smartconfig # Output: password: secret123 -# JSON output in a pipeline -echo 'password: ${ENV:DB_PASSWORD}' | ./smartconfig --json +# Type preservation example +export PORT="8080" +export ENABLED="true" +echo -e 'port: ${ENV:PORT}\nenabled: ${ENV:ENABLED}' | ./smartconfig +# Output: +# port: 8080 # <-- numeric value +# enabled: true # <-- boolean value + +# JSON output preserves types +echo -e 'port: ${ENV:PORT}\nenabled: ${ENV:ENABLED}' | ./smartconfig --json # Output: { -# "password": "secret123" +# "port": 8080, +# "enabled": true # } ``` diff --git a/smartconfig.go b/smartconfig.go index 3a01cd3..ccb1737 100644 --- a/smartconfig.go +++ b/smartconfig.go @@ -4,7 +4,9 @@ import ( "fmt" "io" "os" + "regexp" "strconv" + "strings" "github.com/dustin/go-humanize" "gopkg.in/yaml.v3" @@ -62,16 +64,17 @@ func NewFromReader(reader io.Reader) (*Config, error) { return nil, fmt.Errorf("failed to read config: %w", err) } - // Interpolate variables recursively - interpolated, err := c.interpolate(string(data), 0) + // Parse YAML first + if err := yaml.Unmarshal(data, &c.data); err != nil { + return nil, fmt.Errorf("failed to parse YAML: %w", err) + } + + // Walk and interpolate the parsed structure + interpolated, err := c.walkAndInterpolate(c.data) if err != nil { return nil, fmt.Errorf("failed to interpolate config: %w", err) } - - // Parse as YAML - if err := yaml.Unmarshal([]byte(interpolated), &c.data); err != nil { - return nil, fmt.Errorf("failed to parse YAML: %w", err) - } + c.data = interpolated.(map[string]interface{}) // Handle environment variable injection if err := c.injectEnvironment(); err != nil { @@ -138,16 +141,17 @@ func (c *Config) LoadFromReader(reader io.Reader) error { return fmt.Errorf("failed to read config: %w", err) } - // Interpolate variables recursively - interpolated, err := c.interpolate(string(data), 0) + // Parse YAML first + if err := yaml.Unmarshal(data, &c.data); err != nil { + return fmt.Errorf("failed to parse YAML: %w", err) + } + + // Walk and interpolate the parsed structure + interpolated, err := c.walkAndInterpolate(c.data) if err != nil { return fmt.Errorf("failed to interpolate config: %w", err) } - - // Parse as YAML - if err := yaml.Unmarshal([]byte(interpolated), &c.data); err != nil { - return fmt.Errorf("failed to parse YAML: %w", err) - } + c.data = interpolated.(map[string]interface{}) // Handle environment variable injection if err := c.injectEnvironment(); err != nil { @@ -400,3 +404,219 @@ func (c *Config) GetBytes(key string) (uint64, error) { func (c *Config) Data() map[string]interface{} { return c.data } + +// walkAndInterpolate recursively walks through the parsed YAML structure +// and interpolates any string values containing ${...} patterns. +func (c *Config) walkAndInterpolate(data interface{}) (interface{}, error) { + switch v := data.(type) { + case string: + // Check if this string contains interpolation patterns + if strings.Contains(v, "${") && strings.Contains(v, "}") { + return c.interpolateString(v) + } + return v, nil + + case map[string]interface{}: + // Recursively process map values + result := make(map[string]interface{}) + for key, value := range v { + interpolated, err := c.walkAndInterpolate(value) + if err != nil { + return nil, err + } + result[key] = interpolated + } + return result, nil + + case []interface{}: + // Recursively process array elements + result := make([]interface{}, len(v)) + for i, value := range v { + interpolated, err := c.walkAndInterpolate(value) + if err != nil { + return nil, err + } + result[i] = interpolated + } + return result, nil + + default: + // Return other types as-is (numbers, booleans, etc.) + return v, nil + } +} + +// interpolateString handles interpolation of a single string value. +// It returns the appropriate type based on the resolved value. +func (c *Config) interpolateString(s string) (interface{}, error) { + // If the entire string is a single interpolation, we can return typed values + if match := regexp.MustCompile(`^\$\{([^:]+):(.+)\}$`).FindStringSubmatch(s); match != nil { + resolverName := match[1] + value := match[2] + + // Handle nested interpolations in the value + if strings.Contains(value, "${") { + processedValue, err := c.interpolateValue(value, 0) + if err != nil { + return nil, err + } + value = processedValue + } + + // Resolve the value + resolver, ok := c.resolvers[resolverName] + if !ok { + return nil, fmt.Errorf("unknown resolver: %s", resolverName) + } + + resolved, err := resolver.Resolve(value) + if err != nil { + return nil, fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err) + } + + // Try to convert to appropriate type + return c.convertToType(resolved), nil + } + + // Otherwise, it's a string with embedded interpolations + // We need to interpolate these as strings + return c.interpolateMixedContent(s, 0) +} + +// interpolateValue handles nested interpolations for resolver values +func (c *Config) interpolateValue(s string, depth int) (string, error) { + if depth >= maxRecursionDepth { + return s, nil + } + + result := s + changed := true + + for changed && depth < maxRecursionDepth { + changed = false + positions := findInterpolations(result) + + // Process from end to beginning + for i := len(positions) - 1; i >= 0; i-- { + pos := positions[i] + fullMatch := result[pos.start:pos.end] + inner := fullMatch[2 : len(fullMatch)-1] + + colonIdx := strings.Index(inner, ":") + if colonIdx == -1 { + continue + } + + resolverName := inner[:colonIdx] + value := inner[colonIdx+1:] + + // Recursively handle nested interpolations + if strings.Contains(value, "${") { + interpolatedValue, err := c.interpolateValue(value, depth+1) + if err != nil { + return "", err + } + value = interpolatedValue + } + + // Resolve the value + resolver, ok := c.resolvers[resolverName] + if !ok { + return "", fmt.Errorf("unknown resolver: %s", resolverName) + } + + resolved, err := resolver.Resolve(value) + if err != nil { + return "", fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err) + } + + // Replace without quotes + result = result[:pos.start] + resolved + result[pos.end:] + changed = true + } + + if changed { + depth++ + } + } + + return result, nil +} + +// interpolateMixedContent handles strings with embedded interpolations +func (c *Config) interpolateMixedContent(s string, depth int) (string, error) { + if depth >= maxRecursionDepth { + return s, nil + } + + result := s + positions := findInterpolations(result) + + // Process from end to beginning + for i := len(positions) - 1; i >= 0; i-- { + pos := positions[i] + fullMatch := result[pos.start:pos.end] + inner := fullMatch[2 : len(fullMatch)-1] + + colonIdx := strings.Index(inner, ":") + if colonIdx == -1 { + continue + } + + resolverName := inner[:colonIdx] + value := inner[colonIdx+1:] + + // Handle nested interpolations + if strings.Contains(value, "${") { + interpolatedValue, err := c.interpolateValue(value, depth+1) + if err != nil { + return "", err + } + value = interpolatedValue + } + + // Resolve the value + resolver, ok := c.resolvers[resolverName] + if !ok { + return "", fmt.Errorf("unknown resolver: %s", resolverName) + } + + resolved, err := resolver.Resolve(value) + if err != nil { + return "", fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err) + } + + // Replace without quotes for mixed content + result = result[:pos.start] + resolved + result[pos.end:] + } + + return result, nil +} + +// convertToType attempts to convert a string to its appropriate type +func (c *Config) convertToType(s string) interface{} { + // Try boolean + if s == "true" { + return true + } + if s == "false" { + return false + } + + // Try integer + if i, err := strconv.Atoi(s); err == nil { + return i + } + + // Try float + if f, err := strconv.ParseFloat(s, 64); err == nil { + // Check if it's actually an integer + if float64(int(f)) == f { + return int(f) + } + return f + } + + // Default to string + return s +} diff --git a/smartconfig_test.go b/smartconfig_test.go index fb633ec..6a7f101 100644 --- a/smartconfig_test.go +++ b/smartconfig_test.go @@ -1,10 +1,13 @@ package smartconfig import ( + "fmt" "os" "regexp" "strings" "testing" + + "gopkg.in/yaml.v3" ) func TestEnvResolver(t *testing.T) { @@ -313,13 +316,11 @@ func TestRecursionLimit(t *testing.T) { t.Fatalf("Failed to load config: %v", err) } - // At depth 3, interpolation stops and returns the partial result - // So LEVEL3_${ENV:LEVEL4} is used as the literal key, giving "depth3value" - // Then LEVEL2_depth3value gives "depth2value" - // Finally LEVEL1_depth2value gives "final_with_limit" + // With the new implementation, nested interpolations are handled before type conversion + // The recursion still works up to the limit, so we should get "final" value, _ := config.GetString("value") - if value != "final_with_limit" { - t.Errorf("Expected value to be 'final_with_limit' (recursion limit reached), got '%s'", value) + if value != "final" { + t.Errorf("Expected value to be 'final', got '%s'", value) } } @@ -357,3 +358,56 @@ func TestUnknownResolver(t *testing.T) { t.Errorf("Expected error to mention unknown resolver UNKNOWN, got: %v", err) } } + +func TestYAMLParsing(t *testing.T) { + testCases := []struct { + name string + yaml string + expected interface{} + wantErr bool + }{ + { + name: "interpolation with special chars", + yaml: `keyname: ${EXEC:"blahblah < /one/two/three%"}`, + expected: `${EXEC:"blahblah < /one/two/three%"}`, + wantErr: false, + }, + { + name: "unquoted interpolation", + yaml: `keyname: ${EXEC:echo hello}`, + expected: `${EXEC:echo hello}`, + wantErr: false, + }, + { + name: "numeric value", + yaml: `port: 8080`, + expected: 8080, + wantErr: false, + }, + { + name: "boolean value", + yaml: `enabled: true`, + expected: true, + wantErr: false, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + var data map[string]interface{} + err := yaml.Unmarshal([]byte(tc.yaml), &data) + if (err != nil) != tc.wantErr { + t.Errorf("Unmarshal error = %v, wantErr %v", err, tc.wantErr) + return + } + if err == nil { + for key, value := range data { + t.Logf("%s: %v (type: %T)", key, value, value) + if fmt.Sprintf("%v", value) != fmt.Sprintf("%v", tc.expected) { + t.Errorf("Expected %v, got %v", tc.expected, value) + } + } + } + }) + } +} diff --git a/typed_interpolation_test.go b/typed_interpolation_test.go new file mode 100644 index 0000000..050246c --- /dev/null +++ b/typed_interpolation_test.go @@ -0,0 +1,144 @@ +package smartconfig + +import ( + "os" + "strings" + "testing" +) + +func TestTypedInterpolation(t *testing.T) { + testCases := []struct { + name string + yaml string + env map[string]string + checks []struct { + key string + expectedVal interface{} + expectedType string + } + }{ + { + name: "unquoted number interpolation", + yaml: `port: ${ENV:PORT}`, + env: map[string]string{"PORT": "8080"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "port", expectedVal: 8080, expectedType: "int"}, + }, + }, + { + name: "force string with prefix", + yaml: `port: "port-${ENV:PORT}"`, + env: map[string]string{"PORT": "8080"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "port", expectedVal: "port-8080", expectedType: "string"}, + }, + }, + { + name: "boolean interpolation", + yaml: `enabled: ${ENV:ENABLED}`, + env: map[string]string{"ENABLED": "true"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "enabled", expectedVal: true, expectedType: "bool"}, + }, + }, + { + name: "float interpolation", + yaml: `timeout: ${ENV:TIMEOUT}`, + env: map[string]string{"TIMEOUT": "30.5"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "timeout", expectedVal: 30.5, expectedType: "float64"}, + }, + }, + { + name: "mixed content stays string", + yaml: `message: "Hello ${ENV:NAME}!"`, + env: map[string]string{"NAME": "World"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "message", expectedVal: "Hello World!", expectedType: "string"}, + }, + }, + { + name: "nested interpolation with type", + yaml: `value: ${ENV:PREFIX_${ENV:SUFFIX}}`, + env: map[string]string{"SUFFIX": "VAL", "PREFIX_VAL": "42"}, + checks: []struct { + key string + expectedVal interface{} + expectedType string + }{ + {key: "value", expectedVal: 42, expectedType: "int"}, + }, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + // Set environment variables + for k, v := range tc.env { + _ = os.Setenv(k, v) + defer func(key string) { _ = os.Unsetenv(key) }(k) + } + + // Load config + config, err := NewFromReader(strings.NewReader(tc.yaml)) + if err != nil { + t.Fatalf("Failed to load config: %v", err) + } + + // Check values and types + for _, check := range tc.checks { + val, exists := config.Get(check.key) + if !exists { + t.Errorf("Key %s not found", check.key) + continue + } + + // Check type + actualType := "" + switch val.(type) { + case string: + actualType = "string" + case int: + actualType = "int" + case float64: + actualType = "float64" + case bool: + actualType = "bool" + default: + actualType = "unknown" + } + + if actualType != check.expectedType { + t.Errorf("Key %s: expected type %s, got %s (value: %v)", + check.key, check.expectedType, actualType, val) + } + + // Check value + if val != check.expectedVal { + t.Errorf("Key %s: expected value %v, got %v", + check.key, check.expectedVal, val) + } + } + }) + } +} diff --git a/yaml_syntax_test.go b/yaml_syntax_test.go new file mode 100644 index 0000000..081e0e2 --- /dev/null +++ b/yaml_syntax_test.go @@ -0,0 +1,86 @@ +// This test file verifies that YAML syntax allows unquoted interpolation markers. +// +// Key findings: +// - YAML treats unquoted ${...} expressions as string values +// - Users can write interpolations without quotes (e.g., port: ${ENV:PORT}) +// - Even complex shell commands with special characters work unquoted +// - This confirms our design approach: parse YAML first, then walk string values +// looking for interpolation patterns +// +// This means users have explicit control over types: +// - port: ${ENV:PORT} # We can return a number if ENV:PORT="8080" +// - port: "${ENV:PORT}" # Always returns a string +// - enabled: ${ENV:ENABLED} # We can return a boolean if ENV:ENABLED="true" + +package smartconfig + +import ( + "testing" + + "gopkg.in/yaml.v3" +) + +func TestYAMLInterpolationSyntax(t *testing.T) { + testCases := []struct { + name string + yaml string + shouldParse bool + desc string + }{ + { + name: "unquoted interpolation for number", + yaml: `port: ${ENV:PORT}`, + shouldParse: true, + desc: "Users should be able to use interpolation without quotes to get numbers", + }, + { + name: "unquoted interpolation for boolean", + yaml: `enabled: ${ENV:ENABLED}`, + shouldParse: true, + desc: "Users should be able to use interpolation without quotes to get booleans", + }, + { + name: "interpolation with special shell chars", + yaml: `command: ${EXEC:cat /etc/hosts | grep localhost}`, + shouldParse: true, + desc: "Special shell characters should work unquoted", + }, + { + name: "interpolation with redirect", + yaml: `output: ${EXEC:echo hello > /tmp/test}`, + shouldParse: true, + desc: "Shell redirects should work unquoted", + }, + { + name: "complex exec with special chars", + yaml: `data: ${EXEC:awk '{print $1}' /etc/passwd}`, + shouldParse: true, // YAML treats the whole expression as a string + desc: "Complex commands with quotes work unquoted", + }, + { + name: "nested interpolation unquoted", + yaml: `value: ${ENV:PREFIX_${ENV:SUFFIX}}`, + shouldParse: true, + desc: "Nested interpolations should work unquoted", + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + var data map[string]interface{} + err := yaml.Unmarshal([]byte(tc.yaml), &data) + + if tc.shouldParse && err != nil { + t.Errorf("%s - expected to parse but got error: %v", tc.desc, err) + } else if !tc.shouldParse && err == nil { + t.Errorf("%s - expected parse error but succeeded", tc.desc) + } + + if err == nil { + for key, value := range data { + t.Logf("%s: %v (type: %T)", key, value, value) + } + } + }) + } +}