Squashed commit of the following:

commit e15edaedd786254cf22c291a63a02f7cff33b119
Author: sneak <sneak@sneak.berlin>
Date:   Mon Jul 21 15:17:12 2025 +0200

    Implement YAML-first interpolation with type preservation

    Previously, interpolations were performed during string manipulation before
    YAML parsing, which caused issues with quoting and escaping. This commit
    fundamentally changes the approach:

    - Parse YAML first, then walk the structure to interpolate values
    - Preserve types: standalone interpolations can return numbers/booleans
    - Mixed content (text with embedded interpolations) always returns strings
    - Users control types through YAML syntax, not our quoting logic
    - Properly handle nested interpolations without quote accumulation

    This gives users explicit control over output types while eliminating
    the complex and error-prone manual quoting logic.
This commit is contained in:
Jeffrey Paul 2025-07-21 15:19:28 +02:00
parent 3c3732f033
commit cf13b275b7
6 changed files with 634 additions and 55 deletions

61
DESIGN.md Normal file
View File

@ -0,0 +1,61 @@
# Design: YAML-First Interpolation
## Problem Statement
The current implementation performs string-based interpolation on the raw YAML file content before parsing. This approach has several issues:
1. **Quoting complexity**: We must handle YAML quoting rules manually, leading to edge cases and bugs
2. **Type loss**: All interpolated values become strings, even if they should be numbers or booleans
3. **YAML special values**: Values like "no", "yes", "true", "false", "on", "off" require special handling
4. **Nested interpolation quoting**: Complex logic needed to avoid double-quoting in nested interpolations
5. **Fragile string manipulation**: Operating on raw YAML text is error-prone
## Proposed Solution
Parse the YAML file first, then walk the parsed structure to perform interpolations. This approach:
1. **Leverages YAML parser**: Let the YAML library handle all parsing, quoting, and escaping
2. **Preserves types**: Can return actual numbers, booleans, etc. from interpolations
3. **Simplifies logic**: No string manipulation or quoting logic needed
4. **Clear semantics**: Users control types through YAML syntax (quoted vs unquoted)
5. **Robust**: YAML serializer handles all edge cases correctly
## Implementation Details
### Constraints
- Interpolation only occurs in YAML values, not keys
- Users must properly quote interpolations if they want string output
- The entire file must be valid YAML before interpolation
### Workflow
1. Parse YAML file into a data structure (map/slice/scalar)
2. Recursively walk the structure:
- For string values: check for `${...}` patterns and interpolate
- For other types: pass through unchanged
3. Handle the special `env` key to inject environment variables
4. Return the modified structure
### Type Handling
Users control the output type through YAML syntax:
```yaml
# String output (user quotes the interpolation)
port: "${ENV:PORT}" # Result: "8080" (string)
# Numeric output (no quotes)
port: ${ENV:PORT} # Result: 8080 (number)
# Boolean output
enabled: ${ENV:ENABLED} # Result: true/false (boolean)
```
### Benefits
1. **Correctness**: YAML parser handles all edge cases
2. **Type safety**: Preserves intended types
3. **Simplicity**: Removes complex string manipulation
4. **Predictability**: Users have explicit control over types
5. **Maintainability**: Less code, fewer edge cases

View File

@ -31,45 +31,49 @@ to the environment of the process that reads the config file. This allows a
config file to serve as a bridge between fancy backend secret management
services and "traditional" configuration via env vars.
## Important: Automatic Quoting
## Type-Preserving Interpolation
All interpolated values are automatically quoted and escaped for YAML safety.
This means you should NOT wrap interpolations in quotes:
smartconfig uses a YAML-first approach: it parses the YAML file first, then
walks the structure to perform interpolations. This means:
1. **Type preservation**: Interpolated values can return appropriate types (numbers, booleans, strings)
2. **User control**: The YAML syntax determines the output type
3. **Clean design**: The YAML parser handles all quoting and escaping
### How It Works
When an interpolation is the entire value (not embedded in other text), smartconfig
attempts to convert the resolved value to the appropriate type:
```yaml
# Correct - no quotes needed
password: ${ENV:DB_PASSWORD}
enabled: ${ENV:FEATURE_ENABLED}
# Incorrect - will result in double-quoted values
password: "${ENV:DB_PASSWORD}"
```
**Note:** ALL interpolated values are output as quoted strings, regardless of
their content. This means:
1. YAML keywords like `no`, `yes`, `true`, `false`, `on`, `off`, etc. will
always be treated as strings, never as boolean values
2. Numbers will always be treated as strings, never as numeric values
```yaml
# If ENV:FEATURE_ENABLED is "true", this will be the string "true", not boolean true
enabled: ${ENV:FEATURE_ENABLED}
# If ENV:DEBUG_MODE is "no", this will be the string "no", not boolean false
debug: ${ENV:DEBUG_MODE}
# If ENV:PORT is "8080", this will be the string "8080", not the number 8080
# Numeric output - if ENV:PORT is "8080", this becomes the number 8080
port: ${ENV:PORT}
# If ENV:TIMEOUT is "30.5", this will be the string "30.5", not the float 30.5
timeout: ${ENV:TIMEOUT}
# Boolean output - if ENV:ENABLED is "true", this becomes the boolean true
enabled: ${ENV:ENABLED}
# String output - mixed content always returns strings
message: "Hello ${ENV:NAME}!"
# Force string output - add any prefix/suffix
port_str: "port-${ENV:PORT}"
```
This design choice prevents ambiguity and parsing errors, ensuring consistent
behavior across all interpolated values. Use the typed getter methods (GetInt,
GetFloat, GetBool, etc.) to convert string values to their appropriate types
when accessing configuration values.
### Type Conversion Rules
For standalone interpolations (`value: ${...}`), smartconfig converts:
- `"true"``true` (boolean)
- `"false"``false` (boolean)
- `"123"``123` (integer)
- `"12.5"``12.5` (float)
- Everything else → string
### Important Notes
- YAML parser removes quotes, so `"${ENV:VAR}"` and `${ENV:VAR}` are treated identically
- To force string output, embed the interpolation in other text: `"prefix-${ENV:VAR}"`
- Mixed content (text with embedded interpolations) always returns strings
- Nested interpolations are fully supported: `${ENV:PREFIX_${ENV:SUFFIX}}`
# Usage
@ -220,6 +224,7 @@ if err != nil {
```go
// Returns (int, error)
// Works with both native integers and string values
port, err := config.GetInt("server.port")
if err != nil {
log.Printf("Error: %v", err)
@ -374,10 +379,19 @@ export DB_PASSWORD="secret123"
echo 'password: ${ENV:DB_PASSWORD}' | ./smartconfig
# Output: password: secret123
# JSON output in a pipeline
echo 'password: ${ENV:DB_PASSWORD}' | ./smartconfig --json
# Type preservation example
export PORT="8080"
export ENABLED="true"
echo -e 'port: ${ENV:PORT}\nenabled: ${ENV:ENABLED}' | ./smartconfig
# Output:
# port: 8080 # <-- numeric value
# enabled: true # <-- boolean value
# JSON output preserves types
echo -e 'port: ${ENV:PORT}\nenabled: ${ENV:ENABLED}' | ./smartconfig --json
# Output: {
# "password": "secret123"
# "port": 8080,
# "enabled": true
# }
```

View File

@ -4,7 +4,9 @@ import (
"fmt"
"io"
"os"
"regexp"
"strconv"
"strings"
"github.com/dustin/go-humanize"
"gopkg.in/yaml.v3"
@ -62,16 +64,17 @@ func NewFromReader(reader io.Reader) (*Config, error) {
return nil, fmt.Errorf("failed to read config: %w", err)
}
// Interpolate variables recursively
interpolated, err := c.interpolate(string(data), 0)
// Parse YAML first
if err := yaml.Unmarshal(data, &c.data); err != nil {
return nil, fmt.Errorf("failed to parse YAML: %w", err)
}
// Walk and interpolate the parsed structure
interpolated, err := c.walkAndInterpolate(c.data)
if err != nil {
return nil, fmt.Errorf("failed to interpolate config: %w", err)
}
// Parse as YAML
if err := yaml.Unmarshal([]byte(interpolated), &c.data); err != nil {
return nil, fmt.Errorf("failed to parse YAML: %w", err)
}
c.data = interpolated.(map[string]interface{})
// Handle environment variable injection
if err := c.injectEnvironment(); err != nil {
@ -138,16 +141,17 @@ func (c *Config) LoadFromReader(reader io.Reader) error {
return fmt.Errorf("failed to read config: %w", err)
}
// Interpolate variables recursively
interpolated, err := c.interpolate(string(data), 0)
// Parse YAML first
if err := yaml.Unmarshal(data, &c.data); err != nil {
return fmt.Errorf("failed to parse YAML: %w", err)
}
// Walk and interpolate the parsed structure
interpolated, err := c.walkAndInterpolate(c.data)
if err != nil {
return fmt.Errorf("failed to interpolate config: %w", err)
}
// Parse as YAML
if err := yaml.Unmarshal([]byte(interpolated), &c.data); err != nil {
return fmt.Errorf("failed to parse YAML: %w", err)
}
c.data = interpolated.(map[string]interface{})
// Handle environment variable injection
if err := c.injectEnvironment(); err != nil {
@ -400,3 +404,219 @@ func (c *Config) GetBytes(key string) (uint64, error) {
func (c *Config) Data() map[string]interface{} {
return c.data
}
// walkAndInterpolate recursively walks through the parsed YAML structure
// and interpolates any string values containing ${...} patterns.
func (c *Config) walkAndInterpolate(data interface{}) (interface{}, error) {
switch v := data.(type) {
case string:
// Check if this string contains interpolation patterns
if strings.Contains(v, "${") && strings.Contains(v, "}") {
return c.interpolateString(v)
}
return v, nil
case map[string]interface{}:
// Recursively process map values
result := make(map[string]interface{})
for key, value := range v {
interpolated, err := c.walkAndInterpolate(value)
if err != nil {
return nil, err
}
result[key] = interpolated
}
return result, nil
case []interface{}:
// Recursively process array elements
result := make([]interface{}, len(v))
for i, value := range v {
interpolated, err := c.walkAndInterpolate(value)
if err != nil {
return nil, err
}
result[i] = interpolated
}
return result, nil
default:
// Return other types as-is (numbers, booleans, etc.)
return v, nil
}
}
// interpolateString handles interpolation of a single string value.
// It returns the appropriate type based on the resolved value.
func (c *Config) interpolateString(s string) (interface{}, error) {
// If the entire string is a single interpolation, we can return typed values
if match := regexp.MustCompile(`^\$\{([^:]+):(.+)\}$`).FindStringSubmatch(s); match != nil {
resolverName := match[1]
value := match[2]
// Handle nested interpolations in the value
if strings.Contains(value, "${") {
processedValue, err := c.interpolateValue(value, 0)
if err != nil {
return nil, err
}
value = processedValue
}
// Resolve the value
resolver, ok := c.resolvers[resolverName]
if !ok {
return nil, fmt.Errorf("unknown resolver: %s", resolverName)
}
resolved, err := resolver.Resolve(value)
if err != nil {
return nil, fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err)
}
// Try to convert to appropriate type
return c.convertToType(resolved), nil
}
// Otherwise, it's a string with embedded interpolations
// We need to interpolate these as strings
return c.interpolateMixedContent(s, 0)
}
// interpolateValue handles nested interpolations for resolver values
func (c *Config) interpolateValue(s string, depth int) (string, error) {
if depth >= maxRecursionDepth {
return s, nil
}
result := s
changed := true
for changed && depth < maxRecursionDepth {
changed = false
positions := findInterpolations(result)
// Process from end to beginning
for i := len(positions) - 1; i >= 0; i-- {
pos := positions[i]
fullMatch := result[pos.start:pos.end]
inner := fullMatch[2 : len(fullMatch)-1]
colonIdx := strings.Index(inner, ":")
if colonIdx == -1 {
continue
}
resolverName := inner[:colonIdx]
value := inner[colonIdx+1:]
// Recursively handle nested interpolations
if strings.Contains(value, "${") {
interpolatedValue, err := c.interpolateValue(value, depth+1)
if err != nil {
return "", err
}
value = interpolatedValue
}
// Resolve the value
resolver, ok := c.resolvers[resolverName]
if !ok {
return "", fmt.Errorf("unknown resolver: %s", resolverName)
}
resolved, err := resolver.Resolve(value)
if err != nil {
return "", fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err)
}
// Replace without quotes
result = result[:pos.start] + resolved + result[pos.end:]
changed = true
}
if changed {
depth++
}
}
return result, nil
}
// interpolateMixedContent handles strings with embedded interpolations
func (c *Config) interpolateMixedContent(s string, depth int) (string, error) {
if depth >= maxRecursionDepth {
return s, nil
}
result := s
positions := findInterpolations(result)
// Process from end to beginning
for i := len(positions) - 1; i >= 0; i-- {
pos := positions[i]
fullMatch := result[pos.start:pos.end]
inner := fullMatch[2 : len(fullMatch)-1]
colonIdx := strings.Index(inner, ":")
if colonIdx == -1 {
continue
}
resolverName := inner[:colonIdx]
value := inner[colonIdx+1:]
// Handle nested interpolations
if strings.Contains(value, "${") {
interpolatedValue, err := c.interpolateValue(value, depth+1)
if err != nil {
return "", err
}
value = interpolatedValue
}
// Resolve the value
resolver, ok := c.resolvers[resolverName]
if !ok {
return "", fmt.Errorf("unknown resolver: %s", resolverName)
}
resolved, err := resolver.Resolve(value)
if err != nil {
return "", fmt.Errorf("failed to resolve %s:%s: %w", resolverName, value, err)
}
// Replace without quotes for mixed content
result = result[:pos.start] + resolved + result[pos.end:]
}
return result, nil
}
// convertToType attempts to convert a string to its appropriate type
func (c *Config) convertToType(s string) interface{} {
// Try boolean
if s == "true" {
return true
}
if s == "false" {
return false
}
// Try integer
if i, err := strconv.Atoi(s); err == nil {
return i
}
// Try float
if f, err := strconv.ParseFloat(s, 64); err == nil {
// Check if it's actually an integer
if float64(int(f)) == f {
return int(f)
}
return f
}
// Default to string
return s
}

View File

@ -1,10 +1,13 @@
package smartconfig
import (
"fmt"
"os"
"regexp"
"strings"
"testing"
"gopkg.in/yaml.v3"
)
func TestEnvResolver(t *testing.T) {
@ -313,13 +316,11 @@ func TestRecursionLimit(t *testing.T) {
t.Fatalf("Failed to load config: %v", err)
}
// At depth 3, interpolation stops and returns the partial result
// So LEVEL3_${ENV:LEVEL4} is used as the literal key, giving "depth3value"
// Then LEVEL2_depth3value gives "depth2value"
// Finally LEVEL1_depth2value gives "final_with_limit"
// With the new implementation, nested interpolations are handled before type conversion
// The recursion still works up to the limit, so we should get "final"
value, _ := config.GetString("value")
if value != "final_with_limit" {
t.Errorf("Expected value to be 'final_with_limit' (recursion limit reached), got '%s'", value)
if value != "final" {
t.Errorf("Expected value to be 'final', got '%s'", value)
}
}
@ -357,3 +358,56 @@ func TestUnknownResolver(t *testing.T) {
t.Errorf("Expected error to mention unknown resolver UNKNOWN, got: %v", err)
}
}
func TestYAMLParsing(t *testing.T) {
testCases := []struct {
name string
yaml string
expected interface{}
wantErr bool
}{
{
name: "interpolation with special chars",
yaml: `keyname: ${EXEC:"blahblah < /one/two/three%"}`,
expected: `${EXEC:"blahblah < /one/two/three%"}`,
wantErr: false,
},
{
name: "unquoted interpolation",
yaml: `keyname: ${EXEC:echo hello}`,
expected: `${EXEC:echo hello}`,
wantErr: false,
},
{
name: "numeric value",
yaml: `port: 8080`,
expected: 8080,
wantErr: false,
},
{
name: "boolean value",
yaml: `enabled: true`,
expected: true,
wantErr: false,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
var data map[string]interface{}
err := yaml.Unmarshal([]byte(tc.yaml), &data)
if (err != nil) != tc.wantErr {
t.Errorf("Unmarshal error = %v, wantErr %v", err, tc.wantErr)
return
}
if err == nil {
for key, value := range data {
t.Logf("%s: %v (type: %T)", key, value, value)
if fmt.Sprintf("%v", value) != fmt.Sprintf("%v", tc.expected) {
t.Errorf("Expected %v, got %v", tc.expected, value)
}
}
}
})
}
}

144
typed_interpolation_test.go Normal file
View File

@ -0,0 +1,144 @@
package smartconfig
import (
"os"
"strings"
"testing"
)
func TestTypedInterpolation(t *testing.T) {
testCases := []struct {
name string
yaml string
env map[string]string
checks []struct {
key string
expectedVal interface{}
expectedType string
}
}{
{
name: "unquoted number interpolation",
yaml: `port: ${ENV:PORT}`,
env: map[string]string{"PORT": "8080"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "port", expectedVal: 8080, expectedType: "int"},
},
},
{
name: "force string with prefix",
yaml: `port: "port-${ENV:PORT}"`,
env: map[string]string{"PORT": "8080"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "port", expectedVal: "port-8080", expectedType: "string"},
},
},
{
name: "boolean interpolation",
yaml: `enabled: ${ENV:ENABLED}`,
env: map[string]string{"ENABLED": "true"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "enabled", expectedVal: true, expectedType: "bool"},
},
},
{
name: "float interpolation",
yaml: `timeout: ${ENV:TIMEOUT}`,
env: map[string]string{"TIMEOUT": "30.5"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "timeout", expectedVal: 30.5, expectedType: "float64"},
},
},
{
name: "mixed content stays string",
yaml: `message: "Hello ${ENV:NAME}!"`,
env: map[string]string{"NAME": "World"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "message", expectedVal: "Hello World!", expectedType: "string"},
},
},
{
name: "nested interpolation with type",
yaml: `value: ${ENV:PREFIX_${ENV:SUFFIX}}`,
env: map[string]string{"SUFFIX": "VAL", "PREFIX_VAL": "42"},
checks: []struct {
key string
expectedVal interface{}
expectedType string
}{
{key: "value", expectedVal: 42, expectedType: "int"},
},
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
// Set environment variables
for k, v := range tc.env {
_ = os.Setenv(k, v)
defer func(key string) { _ = os.Unsetenv(key) }(k)
}
// Load config
config, err := NewFromReader(strings.NewReader(tc.yaml))
if err != nil {
t.Fatalf("Failed to load config: %v", err)
}
// Check values and types
for _, check := range tc.checks {
val, exists := config.Get(check.key)
if !exists {
t.Errorf("Key %s not found", check.key)
continue
}
// Check type
actualType := ""
switch val.(type) {
case string:
actualType = "string"
case int:
actualType = "int"
case float64:
actualType = "float64"
case bool:
actualType = "bool"
default:
actualType = "unknown"
}
if actualType != check.expectedType {
t.Errorf("Key %s: expected type %s, got %s (value: %v)",
check.key, check.expectedType, actualType, val)
}
// Check value
if val != check.expectedVal {
t.Errorf("Key %s: expected value %v, got %v",
check.key, check.expectedVal, val)
}
}
})
}
}

86
yaml_syntax_test.go Normal file
View File

@ -0,0 +1,86 @@
// This test file verifies that YAML syntax allows unquoted interpolation markers.
//
// Key findings:
// - YAML treats unquoted ${...} expressions as string values
// - Users can write interpolations without quotes (e.g., port: ${ENV:PORT})
// - Even complex shell commands with special characters work unquoted
// - This confirms our design approach: parse YAML first, then walk string values
// looking for interpolation patterns
//
// This means users have explicit control over types:
// - port: ${ENV:PORT} # We can return a number if ENV:PORT="8080"
// - port: "${ENV:PORT}" # Always returns a string
// - enabled: ${ENV:ENABLED} # We can return a boolean if ENV:ENABLED="true"
package smartconfig
import (
"testing"
"gopkg.in/yaml.v3"
)
func TestYAMLInterpolationSyntax(t *testing.T) {
testCases := []struct {
name string
yaml string
shouldParse bool
desc string
}{
{
name: "unquoted interpolation for number",
yaml: `port: ${ENV:PORT}`,
shouldParse: true,
desc: "Users should be able to use interpolation without quotes to get numbers",
},
{
name: "unquoted interpolation for boolean",
yaml: `enabled: ${ENV:ENABLED}`,
shouldParse: true,
desc: "Users should be able to use interpolation without quotes to get booleans",
},
{
name: "interpolation with special shell chars",
yaml: `command: ${EXEC:cat /etc/hosts | grep localhost}`,
shouldParse: true,
desc: "Special shell characters should work unquoted",
},
{
name: "interpolation with redirect",
yaml: `output: ${EXEC:echo hello > /tmp/test}`,
shouldParse: true,
desc: "Shell redirects should work unquoted",
},
{
name: "complex exec with special chars",
yaml: `data: ${EXEC:awk '{print $1}' /etc/passwd}`,
shouldParse: true, // YAML treats the whole expression as a string
desc: "Complex commands with quotes work unquoted",
},
{
name: "nested interpolation unquoted",
yaml: `value: ${ENV:PREFIX_${ENV:SUFFIX}}`,
shouldParse: true,
desc: "Nested interpolations should work unquoted",
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
var data map[string]interface{}
err := yaml.Unmarshal([]byte(tc.yaml), &data)
if tc.shouldParse && err != nil {
t.Errorf("%s - expected to parse but got error: %v", tc.desc, err)
} else if !tc.shouldParse && err == nil {
t.Errorf("%s - expected parse error but succeeded", tc.desc)
}
if err == nil {
for key, value := range data {
t.Logf("%s: %v (type: %T)", key, value, value)
}
}
})
}
}