Learn how to combine DataFrames using various merge strategies in GPandas, similar to SQL JOIN operations.

 

Overview

The Merge() function combines two DataFrames based on a common column (key). GPandas supports four merge types:

Merge TypeConstantDescription
InnerInnerMergeOnly matching rows from both DataFrames
LeftLeftMergeAll rows from left, matching from right
RightRightMergeAll rows from right, matching from left
Full OuterFullMergeAll rows from both DataFrames

 

Function Signature

func (df *DataFrame) Merge(other *DataFrame, on string, how MergeHow) (*DataFrame, error)

 

Parameters

ParameterTypeDescription
other*DataFrameThe right DataFrame to merge with
onstringColumn name to join on (must exist in both DataFrames)
howMergeHowType of merge: InnerMerge, LeftMerge, RightMerge, FullMerge

 

Returns

TypeDescription
*DataFrameNew merged DataFrame
errorError if merge fails

 


 

Sample Data

All examples use these two DataFrames:

Left DataFrame (df1) - Employees

IDNameDepartment
1AliceEngineering
2BobMarketing
3CharlieEngineering
4DianaSales

 

Right DataFrame (df2) - Salaries

IDSalaryBonus
1850005000
2720003000
5950008000
6680002000

 

Setup Code

package main

import (
    "fmt"
    "log"

    "github.com/apoplexi24/gpandas"
    "github.com/apoplexi24/gpandas/dataframe"
)

func main() {
    gp := gpandas.GoPandas{}
    
    // Create employees DataFrame
    df1, _ := gp.DataFrame(
        []string{"ID", "Name", "Department"},
        []gpandas.Column{
            {int64(1), int64(2), int64(3), int64(4)},
            {"Alice", "Bob", "Charlie", "Diana"},
            {"Engineering", "Marketing", "Engineering", "Sales"},
        },
        map[string]any{
            "ID":         gpandas.IntCol{},
            "Name":       gpandas.StringCol{},
            "Department": gpandas.StringCol{},
        },
    )
    
    // Create salaries DataFrame
    df2, _ := gp.DataFrame(
        []string{"ID", "Salary", "Bonus"},
        []gpandas.Column{
            {int64(1), int64(2), int64(5), int64(6)},
            {85000.0, 72000.0, 95000.0, 68000.0},
            {5000.0, 3000.0, 8000.0, 2000.0},
        },
        map[string]any{
            "ID":     gpandas.IntCol{},
            "Salary": gpandas.FloatCol{},
            "Bonus":  gpandas.FloatCol{},
        },
    )
    
    // Examples follow...
}

 


 

Inner Merge

Returns only rows where the key exists in both DataFrames.

flowchart LR
    subgraph Left["Left DataFrame"]
        L1[ID: 1 - Alice]
        L2[ID: 2 - Bob]
        L3[ID: 3 - Charlie]
        L4[ID: 4 - Diana]
    end
    
    subgraph Right["Right DataFrame"]
        R1[ID: 1 - $85k]
        R2[ID: 2 - $72k]
        R3[ID: 5 - $95k]
        R4[ID: 6 - $68k]
    end
    
    subgraph Result["Inner Merge Result"]
        M1[ID: 1 - Alice - $85k]
        M2[ID: 2 - Bob - $72k]
    end
    
    L1 -->|match| M1
    L2 -->|match| M2
    L3 -.->|no match| X1[excluded]
    L4 -.->|no match| X2[excluded]
    R1 -->|match| M1
    R2 -->|match| M2
    R3 -.->|no match| X3[excluded]
    R4 -.->|no match| X4[excluded]
    
    style Left fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Right fill:#1e293b,stroke:#22c55e,stroke-width:2px
    style Result fill:#1e293b,stroke:#f59e0b,stroke-width:2px

 

Inner Merge Example

result, err := df1.Merge(df2, "ID", dataframe.InnerMerge)
if err != nil {
    log.Fatalf("Merge failed: %v", err)
}
fmt.Println(result.String())

 

Inner Merge Output

+----+-------+-------------+--------+-------+
| ID | Name  | Department  | Salary | Bonus |
+----+-------+-------------+--------+-------+
| 1  | Alice | Engineering | 85000  | 5000  |
| 2  | Bob   | Marketing   | 72000  | 3000  |
+----+-------+-------------+--------+-------+
[2 rows x 5 columns]

 


 

Left Merge

Returns all rows from the left DataFrame, with matching data from the right. Non-matching rows have nil for right columns.

flowchart LR
    subgraph Left["Left DataFrame"]
        L1[ID: 1 - Alice]
        L2[ID: 2 - Bob]
        L3[ID: 3 - Charlie]
        L4[ID: 4 - Diana]
    end
    
    subgraph Right["Right DataFrame"]
        R1[ID: 1 - $85k]
        R2[ID: 2 - $72k]
        R3[ID: 5 - $95k]
        R4[ID: 6 - $68k]
    end
    
    subgraph Result["Left Merge Result"]
        M1[ID: 1 - Alice - $85k]
        M2[ID: 2 - Bob - $72k]
        M3[ID: 3 - Charlie - nil]
        M4[ID: 4 - Diana - nil]
    end
    
    L1 -->|match| M1
    L2 -->|match| M2
    L3 -->|kept| M3
    L4 -->|kept| M4
    R1 -->|match| M1
    R2 -->|match| M2
    R3 -.->|excluded| X1[not used]
    R4 -.->|excluded| X2[not used]
    
    style Left fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Right fill:#1e293b,stroke:#22c55e,stroke-width:2px
    style Result fill:#1e293b,stroke:#f59e0b,stroke-width:2px

 

Left Merge Example

result, err := df1.Merge(df2, "ID", dataframe.LeftMerge)
if err != nil {
    log.Fatalf("Merge failed: %v", err)
}
fmt.Println(result.String())

 

Left Merge Output

+----+---------+-------------+--------+-------+
| ID | Name    | Department  | Salary | Bonus |
+----+---------+-------------+--------+-------+
| 1  | Alice   | Engineering | 85000  | 5000  |
| 2  | Bob     | Marketing   | 72000  | 3000  |
| 3  | Charlie | Engineering | <nil>  | <nil> |
| 4  | Diana   | Sales       | <nil>  | <nil> |
+----+---------+-------------+--------+-------+
[4 rows x 5 columns]

 


 

Right Merge

Returns all rows from the right DataFrame, with matching data from the left. Non-matching rows have nil for left columns.

flowchart LR
    subgraph Left["Left DataFrame"]
        L1[ID: 1 - Alice]
        L2[ID: 2 - Bob]
        L3[ID: 3 - Charlie]
        L4[ID: 4 - Diana]
    end
    
    subgraph Right["Right DataFrame"]
        R1[ID: 1 - $85k]
        R2[ID: 2 - $72k]
        R3[ID: 5 - $95k]
        R4[ID: 6 - $68k]
    end
    
    subgraph Result["Right Merge Result"]
        M1[ID: 1 - Alice - $85k]
        M2[ID: 2 - Bob - $72k]
        M3[ID: 5 - nil - $95k]
        M4[ID: 6 - nil - $68k]
    end
    
    L1 -->|match| M1
    L2 -->|match| M2
    L3 -.->|excluded| X1[not used]
    L4 -.->|excluded| X2[not used]
    R1 -->|match| M1
    R2 -->|match| M2
    R3 -->|kept| M3
    R4 -->|kept| M4
    
    style Left fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Right fill:#1e293b,stroke:#22c55e,stroke-width:2px
    style Result fill:#1e293b,stroke:#f59e0b,stroke-width:2px

 

Right Merge Example

result, err := df1.Merge(df2, "ID", dataframe.RightMerge)
if err != nil {
    log.Fatalf("Merge failed: %v", err)
}
fmt.Println(result.String())

 

Right Merge Output

+----+-------+-------------+--------+-------+
| ID | Name  | Department  | Salary | Bonus |
+----+-------+-------------+--------+-------+
| 1  | Alice | Engineering | 85000  | 5000  |
| 2  | Bob   | Marketing   | 72000  | 3000  |
| 5  | <nil> | <nil>       | 95000  | 8000  |
| 6  | <nil> | <nil>       | 68000  | 2000  |
+----+-------+-------------+--------+-------+
[4 rows x 5 columns]

 


 

Full Outer Merge

Returns all rows from both DataFrames. Non-matching rows have nil for missing columns.

flowchart LR
    subgraph Left["Left DataFrame"]
        L1[ID: 1 - Alice]
        L2[ID: 2 - Bob]
        L3[ID: 3 - Charlie]
        L4[ID: 4 - Diana]
    end
    
    subgraph Right["Right DataFrame"]
        R1[ID: 1 - $85k]
        R2[ID: 2 - $72k]
        R3[ID: 5 - $95k]
        R4[ID: 6 - $68k]
    end
    
    subgraph Result["Full Merge Result"]
        M1[ID: 1 - Alice - $85k]
        M2[ID: 2 - Bob - $72k]
        M3[ID: 3 - Charlie - nil]
        M4[ID: 4 - Diana - nil]
        M5[ID: 5 - nil - $95k]
        M6[ID: 6 - nil - $68k]
    end
    
    L1 --> M1
    L2 --> M2
    L3 --> M3
    L4 --> M4
    R1 --> M1
    R2 --> M2
    R3 --> M5
    R4 --> M6
    
    style Left fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Right fill:#1e293b,stroke:#22c55e,stroke-width:2px
    style Result fill:#1e293b,stroke:#f59e0b,stroke-width:2px

 

Full Merge Example

result, err := df1.Merge(df2, "ID", dataframe.FullMerge)
if err != nil {
    log.Fatalf("Merge failed: %v", err)
}
fmt.Println(result.String())

 

Full Merge Output

+----+---------+-------------+--------+-------+
| ID | Name    | Department  | Salary | Bonus |
+----+---------+-------------+--------+-------+
| 1  | Alice   | Engineering | 85000  | 5000  |
| 2  | Bob     | Marketing   | 72000  | 3000  |
| 3  | Charlie | Engineering | <nil>  | <nil> |
| 4  | Diana   | Sales       | <nil>  | <nil> |
| 5  | <nil>   | <nil>       | 95000  | 8000  |
| 6  | <nil>   | <nil>       | 68000  | 2000  |
+----+---------+-------------+--------+-------+
[6 rows x 5 columns]

 


 

Merge Type Comparison

Visual comparison of all merge types:

graph TD
    subgraph Comparison["Merge Type Comparison"]
        subgraph Inner["Inner Merge"]
            I[Only matching<br/>rows from both]
        end
        
        subgraph Left["Left Merge"]
            L[All left rows +<br/>matching right]
        end
        
        subgraph Right["Right Merge"]
            R[All right rows +<br/>matching left]
        end
        
        subgraph Full["Full Merge"]
            F[All rows<br/>from both]
        end
    end
    
    style Inner fill:#ef4444,stroke:#dc2626,stroke-width:2px
    style Left fill:#3b82f6,stroke:#2563eb,stroke-width:2px
    style Right fill:#22c55e,stroke:#16a34a,stroke-width:2px
    style Full fill:#f59e0b,stroke:#d97706,stroke-width:2px

 

Merge TypeLeft RowsRight RowsResult Rows (Example)
InnerMatching onlyMatching only2
LeftAll (4)Matching only4
RightMatching onlyAll (4)4
FullAll (4)All (4)6

 


 

SQL Equivalent

GPandasSQL Equivalent
InnerMergeINNER JOIN
LeftMergeLEFT OUTER JOIN
RightMergeRIGHT OUTER JOIN
FullMergeFULL OUTER JOIN
-- Inner Merge equivalent
SELECT * FROM employees e
INNER JOIN salaries s ON e.ID = s.ID;

-- Left Merge equivalent
SELECT * FROM employees e
LEFT OUTER JOIN salaries s ON e.ID = s.ID;

-- Right Merge equivalent
SELECT * FROM employees e
RIGHT OUTER JOIN salaries s ON e.ID = s.ID;

-- Full Merge equivalent
SELECT * FROM employees e
FULL OUTER JOIN salaries s ON e.ID = s.ID;

 


 

Handling Many-to-Many Relationships

When the key column has duplicate values, the merge produces a Cartesian product of matching rows:

// df1 has duplicate IDs
// ID: 1, 1, 2
// Name: Alice, Alex, Bob

// df2 has duplicate IDs  
// ID: 1, 1, 2
// Salary: 80k, 85k, 72k

// Inner merge produces:
// ID: 1, 1, 1, 1, 2
// Name: Alice, Alice, Alex, Alex, Bob
// Salary: 80k, 85k, 80k, 85k, 72k

 


 

Error Handling

Common Errors

ErrorCauseSolution
“both DataFrames must be non-nil”nil DataFrame passedCheck DataFrame initialization
“column ‘X’ not found in left DataFrame”Key column missingVerify column name
“column ‘X’ not found in right DataFrame”Key column missingVerify column name
“invalid merge type”Invalid MergeHow valueUse predefined constants

 

Example Error Handling

result, err := df1.Merge(df2, "ID", dataframe.InnerMerge)
if err != nil {
    switch {
    case strings.Contains(err.Error(), "non-nil"):
        log.Fatal("One of the DataFrames is nil")
    case strings.Contains(err.Error(), "not found"):
        log.Fatal("Key column doesn't exist in one of the DataFrames")
    default:
        log.Fatalf("Merge error: %v", err)
    }
}

 


 

Complete Example: Multi-Table Join

package main

import (
    "fmt"
    "log"

    "github.com/apoplexi24/gpandas"
    "github.com/apoplexi24/gpandas/dataframe"
)

func main() {
    gp := gpandas.GoPandas{}
    
    // Load three related tables
    employees, _ := gp.Read_csv("employees.csv")       // ID, Name, DeptID
    departments, _ := gp.Read_csv("departments.csv")   // DeptID, DeptName
    salaries, _ := gp.Read_csv("salaries.csv")         // ID, Salary
    
    // First merge: employees + departments
    empDept, err := employees.Merge(departments, "DeptID", dataframe.LeftMerge)
    if err != nil {
        log.Fatalf("First merge failed: %v", err)
    }
    fmt.Println("Employees with Departments:")
    fmt.Println(empDept.String())
    
    // Second merge: result + salaries
    final, err := empDept.Merge(salaries, "ID", dataframe.LeftMerge)
    if err != nil {
        log.Fatalf("Second merge failed: %v", err)
    }
    fmt.Println("\nComplete Employee Data:")
    fmt.Println(final.String())
    
    // Export final result
    _, err = final.ToCSV("complete_employee_data.csv", ",")
    if err != nil {
        log.Printf("Export warning: %v", err)
    }
}

 

See Also