Learn how to sort DataFrames by column values or index labels in GPandas, with support for multi-column sorting, null handling, and in-place modifications.

 

Overview

GPandas provides two sorting methods:

| Operation | Method | Description |
|---|---|---|
| Sort by Values | SortValues() | Sort by one or more column values |
| Sort by Index | SortIndex() | Sort by index labels |

 


 

SortValues

Sorts the DataFrame by the values in one or more columns, similar to pandas’ df.sort_values().

 

Function Signature

func (df *DataFrame) SortValues(opts SortOptions) (*DataFrame, error)

 

SortOptions

| Field | Type | Description | Default |
|---|---|---|---|
| By | []string | Column names to sort by | Required |
| Ascending | []bool | Sort order for each column | All true |
| NaPosition | NaPosition | Where to place null values | NaLast |
| Inplace | bool | Modify DataFrame in place | false |
| IgnoreIndex | bool | Reset index after sorting | false |

 

NaPosition Constants

| Constant | Description |
|---|---|
| NaLast | Place null values at the end (default) |
| NaFirst | Place null values at the beginning |

 

Supported Types

| Type | Comparison |
|---|---|
| float64 | Numeric comparison |
| int64 | Numeric comparison |
| string | Lexicographic comparison |
| bool | false < true |
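
The comparison rules in the table can be sketched as a small type switch. This is an illustrative, standalone example and not GPandas's internal comparator: `compareValues` and `sign` are hypothetical names, and the error text only mirrors the "type mismatch" message listed under Error Handling.

```go
package main

import (
    "fmt"
    "strings"
)

// compareValues is a hypothetical helper, not part of GPandas. It returns
// -1, 0, or 1 using the comparison rules from the table, and an error when
// the two values do not share one of the four supported types.
func compareValues(a, b any) (int, error) {
    switch x := a.(type) {
    case float64:
        if y, ok := b.(float64); ok {
            return sign(x < y, x > y), nil
        }
    case int64:
        if y, ok := b.(int64); ok {
            return sign(x < y, x > y), nil
        }
    case string:
        if y, ok := b.(string); ok {
            return strings.Compare(x, y), nil
        }
    case bool:
        if y, ok := b.(bool); ok {
            // false sorts before true.
            return sign(!x && y, x && !y), nil
        }
    }
    return 0, fmt.Errorf("type mismatch: cannot compare %T with %T", a, b)
}

// sign converts a pair of "less" / "greater" flags into -1, 0, or 1.
func sign(less, greater bool) int {
    switch {
    case less:
        return -1
    case greater:
        return 1
    default:
        return 0
    }
}

func main() {
    c, err := compareValues(int64(25), int64(30))
    fmt.Println(c, err) // -1 <nil>

    c, err = compareValues(false, true)
    fmt.Println(c, err) // -1 <nil>

    _, err = compareValues(int64(1), "1")
    fmt.Println(err) // type mismatch: cannot compare int64 with string
}
```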

 


 

Sample Data

All examples use this employee DataFrame:

Employees DataFrame

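| Name | Department | Age | Salary |
|---|---|---|---|
| Alice | Engineering | 30 | 95000 |
| Bob | Sales | 25 | 55000 |
| Charlie | Engineering | 35 | 105000 |
| Diana | Sales | 28 | 62000 |
| Eve | Marketing | 32 | 72000 |
| Frank | Engineering | 27 | 88000 |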

 

Setup Code

package main

import (
    "fmt"
    "log"

    "github.com/apoplexi24/gpandas"
    "github.com/apoplexi24/gpandas/dataframe"
)

func main() {
    gp := gpandas.GoPandas{}
    
    // Create employee DataFrame
    df, err := gp.DataFrame(
        []string{"Name", "Department", "Age", "Salary"},
        []gpandas.Column{
            {"Alice", "Bob", "Charlie", "Diana", "Eve", "Frank"},
            {"Engineering", "Sales", "Engineering", "Sales", "Marketing", "Engineering"},
            {int64(30), int64(25), int64(35), int64(28), int64(32), int64(27)},
            {95000.0, 55000.0, 105000.0, 62000.0, 72000.0, 88000.0},
        },
        map[string]any{
            "Name":       gpandas.StringCol{},
            "Department": gpandas.StringCol{},
            "Age":        gpandas.IntCol{},
            "Salary":     gpandas.FloatCol{},
        },
    )
    if err != nil {
        log.Fatalf("Failed to create DataFrame: %v", err)
    }
    
    // Examples follow...
}

 


 

Single Column Sort (Ascending)

Sort by a single column in ascending order:

flowchart LR
    subgraph Original["Original DataFrame"]
        O1[Salary: 95000]
        O2[Salary: 55000]
        O3[Salary: 105000]
        O4[Salary: 62000]
    end
    
    subgraph Sort["SortValues Operation"]
        OP[Sort by Salary<br/>Ascending]
    end
    
    subgraph Result["Sorted DataFrame"]
        R1[Salary: 55000]
        R2[Salary: 62000]
        R3[Salary: 95000]
        R4[Salary: 105000]
    end
    
    O1 --> OP
    O2 --> OP
    O3 --> OP
    O4 --> OP
    OP --> R1
    OP --> R2
    OP --> R3
    OP --> R4
    
    style Original fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Sort fill:#1e293b,stroke:#f59e0b,stroke-width:2px
    style Result fill:#1e293b,stroke:#22c55e,stroke-width:2px

 

Example

sorted, err := df.SortValues(dataframe.SortOptions{
    By: []string{"Salary"},
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}
fmt.Println(sorted.String())

 

Output

+---------+-------------+-----+--------+
| Name    | Department  | Age | Salary |
+---------+-------------+-----+--------+
| Bob     | Sales       | 25  | 55000  |
| Diana   | Sales       | 28  | 62000  |
| Eve     | Marketing   | 32  | 72000  |
| Frank   | Engineering | 27  | 88000  |
| Alice   | Engineering | 30  | 95000  |
| Charlie | Engineering | 35  | 105000 |
+---------+-------------+-----+--------+
[6 rows x 4 columns]

 


 

Single Column Sort (Descending)

Sort by a single column in descending order:

sorted, err := df.SortValues(dataframe.SortOptions{
    By:        []string{"Age"},
    Ascending: []bool{false},
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}
fmt.Println(sorted.String())

 

Output

+---------+-------------+-----+--------+
| Name    | Department  | Age | Salary |
+---------+-------------+-----+--------+
| Charlie | Engineering | 35  | 105000 |
| Eve     | Marketing   | 32  | 72000  |
| Alice   | Engineering | 30  | 95000  |
| Diana   | Sales       | 28  | 62000  |
| Frank   | Engineering | 27  | 88000  |
| Bob     | Sales       | 25  | 55000  |
+---------+-------------+-----+--------+
[6 rows x 4 columns]

 


 

Multi-Column Sort

Sort by multiple columns with independent sort orders:

sorted, err := df.SortValues(dataframe.SortOptions{
    By:        []string{"Department", "Salary"},
    Ascending: []bool{true, false},  // Dept ascending, Salary descending
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}
fmt.Println(sorted.String())

 

Output

+---------+-------------+-----+--------+
| Name    | Department  | Age | Salary |
+---------+-------------+-----+--------+
| Charlie | Engineering | 35  | 105000 |
| Alice   | Engineering | 30  | 95000  |
| Frank   | Engineering | 27  | 88000  |
| Eve     | Marketing   | 32  | 72000  |
| Diana   | Sales       | 28  | 62000  |
| Bob     | Sales       | 25  | 55000  |
+---------+-------------+-----+--------+
[6 rows x 4 columns]

 

Multi-Column Sort Flow

flowchart TD
    subgraph Input["Original Data"]
        I[Mixed departments<br/>and salaries]
    end
    
    subgraph Primary["Primary Sort"]
        P[Sort by Department<br/>ascending]
    end
    
    subgraph Secondary["Secondary Sort"]
        S[Within each department<br/>sort by Salary descending]
    end
    
    subgraph Result["Sorted Result"]
        R[Engineering: 105k, 95k, 88k<br/>Marketing: 72k<br/>Sales: 62k, 55k]
    end
    
    I --> P
    P --> S
    S --> R
    
    style Input fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Primary fill:#1e293b,stroke:#f59e0b,stroke-width:2px
    style Secondary fill:#1e293b,stroke:#8b5cf6,stroke-width:2px
    style Result fill:#1e293b,stroke:#22c55e,stroke-width:2px

 


 

Ascending Options

The Ascending field provides flexible sort order control:

| Configuration | Behavior |
|---|---|
| Empty []bool{} | All columns ascending |
| Single value []bool{false} | Applies to all columns |
| Multiple values | Must match length of By |

 

Examples

// Errors are ignored here for brevity.

// All ascending (default)
sorted, _ := df.SortValues(dataframe.SortOptions{
    By: []string{"Age", "Salary"},
    // Ascending not specified - defaults to all true
})

// All descending
sorted, _ = df.SortValues(dataframe.SortOptions{
    By:        []string{"Age", "Salary"},
    Ascending: []bool{false},  // Single value applies to all
})

// Mixed order
sorted, _ = df.SortValues(dataframe.SortOptions{
    By:        []string{"Age", "Salary"},
    Ascending: []bool{true, false},  // Age asc, Salary desc
})

 


 

Null Handling

Control where null values appear in sorted results:

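// These snippets assume a DataFrame with a nullable "Score" column,
// which is not part of the sample data. Errors are ignored for brevity.

// Nulls at the end (default)
sorted, _ := df.SortValues(dataframe.SortOptions{
    By:         []string{"Score"},
    NaPosition: dataframe.NaLast,
})

// Nulls at the beginning
sorted, _ = df.SortValues(dataframe.SortOptions{
    By:         []string{"Score"},
    NaPosition: dataframe.NaFirst,
})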

 

Null Positioning

| NaPosition | Ascending Sort | Descending Sort |
|---|---|---|
| NaLast | 1, 2, 3, nil | 3, 2, 1, nil |
| NaFirst | nil, 1, 2, 3 | nil, 3, 2, 1 |
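
The table shows that null placement does not depend on sort direction. A comparator with that property can be sketched as follows, using nil pointers to stand in for nulls. This is a standalone illustration, not GPandas's code; `sortNullable`, `nullable`, and `render` are hypothetical helpers.

```go
package main

import (
    "fmt"
    "sort"
    "strings"
)

// nullable wraps a value in a pointer; a nil pointer represents a null.
func nullable(v int64) *int64 { return &v }

// sortNullable sorts values in the requested direction while keeping nulls
// at the chosen end regardless of that direction.
func sortNullable(vals []*int64, ascending, nullsFirst bool) {
    sort.SliceStable(vals, func(i, j int) bool {
        a, b := vals[i], vals[j]
        switch {
        case a == nil && b == nil:
            return false // nulls are equal to each other
        case a == nil:
            return nullsFirst
        case b == nil:
            return !nullsFirst
        case ascending:
            return *a < *b
        default:
            return *a > *b
        }
    })
}

// render formats the values the same way as the table above.
func render(vals []*int64) string {
    parts := make([]string, len(vals))
    for i, v := range vals {
        if v == nil {
            parts[i] = "nil"
        } else {
            parts[i] = fmt.Sprint(*v)
        }
    }
    return strings.Join(parts, ", ")
}

func main() {
    vals := []*int64{nullable(2), nil, nullable(3), nullable(1)}

    sortNullable(vals, false, false) // descending, nulls last
    fmt.Println(render(vals))        // 3, 2, 1, nil

    sortNullable(vals, true, true) // ascending, nulls first
    fmt.Println(render(vals))      // nil, 1, 2, 3
}
```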

 


 

In-Place Sorting

Modify the DataFrame directly instead of creating a copy:

// Sort in place (modifies original DataFrame)
_, err := df.SortValues(dataframe.SortOptions{
    By:      []string{"Salary"},
    Inplace: true,
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}

// df is now sorted, no new DataFrame created
fmt.Println(df.String())

 

In-Place vs Copy

| Option | Returns | Original DataFrame | Use Case |
|---|---|---|---|
| Inplace: false | New sorted DataFrame | Unchanged | Need both versions |
| Inplace: true | nil | Modified | Save memory |

 


 

Index Reset

Reset the index to sequential integers after sorting:

sorted, err := df.SortValues(dataframe.SortOptions{
    By:          []string{"Salary"},
    Ascending:   []bool{false},
    IgnoreIndex: true,
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}
fmt.Println(sorted.String())

 

Without IgnoreIndex

+---------+-------------+-----+--------+
| Index   | Name        | Age | Salary |
+---------+-------------+-----+--------+
| 2       | Charlie     | 35  | 105000 |
| 0       | Alice       | 30  | 95000  |
| 5       | Frank       | 27  | 88000  |
+---------+-------------+-----+--------+

 

With IgnoreIndex

+---------+-------------+-----+--------+
| Index   | Name        | Age | Salary |
+---------+-------------+-----+--------+
| 0       | Charlie     | 35  | 105000 |
| 1       | Alice       | 30  | 95000  |
| 2       | Frank       | 27  | 88000  |
+---------+-------------+-----+--------+

 


 

SortIndex

Sorts the DataFrame by its index labels in lexicographic order.

 

Function Signature

func (df *DataFrame) SortIndex(ascending bool) (*DataFrame, error)

 

Parameters

ParameterTypeDescription
ascendingbooltrue for ascending, false for descending

 

Example

// Set custom index labels
err := df.SetIndex([]string{"f", "b", "e", "a", "d", "c"})
if err != nil {
    log.Fatalf("SetIndex failed: %v", err)
}

fmt.Println("Before sorting by index:")
fmt.Println(df.String())

// Sort by index (ascending)
sorted, err := df.SortIndex(true)
if err != nil {
    log.Fatalf("SortIndex failed: %v", err)
}

fmt.Println("\nAfter sorting by index:")
fmt.Println(sorted.String())

 

Output

Before sorting by index:
+-------+---------+-------------+-----+--------+
| Index | Name    | Department  | Age | Salary |
+-------+---------+-------------+-----+--------+
| f     | Alice   | Engineering | 30  | 95000  |
| b     | Bob     | Sales       | 25  | 55000  |
| e     | Charlie | Engineering | 35  | 105000 |
| a     | Diana   | Sales       | 28  | 62000  |
| d     | Eve     | Marketing   | 32  | 72000  |
| c     | Frank   | Engineering | 27  | 88000  |
+-------+---------+-------------+-----+--------+

After sorting by index:
+-------+---------+-------------+-----+--------+
| Index | Name    | Department  | Age | Salary |
+-------+---------+-------------+-----+--------+
| a     | Diana   | Sales       | 28  | 62000  |
| b     | Bob     | Sales       | 25  | 55000  |
| c     | Frank   | Engineering | 27  | 88000  |
| d     | Eve     | Marketing   | 32  | 72000  |
| e     | Charlie | Engineering | 35  | 105000 |
| f     | Alice   | Engineering | 30  | 95000  |
+-------+---------+-------------+-----+--------+

 


 

Sort Stability

GPandas uses stable sorting, preserving the relative order of equal elements:

// DataFrame with duplicate ages
// Name: Alice, Bob, Charlie, Diana
// Age:  30,   30,  25,      25

sorted, _ := df.SortValues(dataframe.SortOptions{
    By: []string{"Age"},
})

// Result preserves original order for equal ages:
// Name: Charlie, Diana, Alice, Bob
// Age:  25,      25,    30,    30

 


 

Alphabetical Sort

Sort string columns alphabetically:

sorted, err := df.SortValues(dataframe.SortOptions{
    By: []string{"Name"},
})
if err != nil {
    log.Fatalf("Sort failed: %v", err)
}
fmt.Println(sorted.String())

 

Output

+---------+-------------+-----+--------+
| Name    | Department  | Age | Salary |
+---------+-------------+-----+--------+
| Alice   | Engineering | 30  | 95000  |
| Bob     | Sales       | 25  | 55000  |
| Charlie | Engineering | 35  | 105000 |
| Diana   | Sales       | 28  | 62000  |
| Eve     | Marketing   | 32  | 72000  |
| Frank   | Engineering | 27  | 88000  |
+---------+-------------+-----+--------+
[6 rows x 4 columns]

 


 

Sorting Workflow

flowchart TD
    subgraph Input["Original DataFrame"]
        I[Unsorted data]
    end
    
    subgraph Validate["Validation"]
        V1[Check columns exist]
        V2[Validate options]
    end
    
    subgraph Process["Sort Process"]
        P1[Extract column values]
        P2[Build sort indices]
        P3[Apply stable sort]
    end
    
    subgraph Output["Result"]
        O1{Inplace?}
        O2[Modify original]
        O3[Return new DataFrame]
    end
    
    I --> V1
    V1 --> V2
    V2 --> P1
    P1 --> P2
    P2 --> P3
    P3 --> O1
    O1 -->|Yes| O2
    O1 -->|No| O3
    
    style Input fill:#1e293b,stroke:#3b82f6,stroke-width:2px
    style Validate fill:#1e293b,stroke:#f59e0b,stroke-width:2px
    style Process fill:#1e293b,stroke:#8b5cf6,stroke-width:2px
    style Output fill:#1e293b,stroke:#22c55e,stroke-width:2px

 


 

Error Handling

Common Errors

| Error | Cause | Solution |
|---|---|---|
| "DataFrame is nil" | Operating on nil DataFrame | Check DataFrame initialization |
| "'By' must contain at least one column" | Empty By slice | Provide at least one column name |
| "column 'X' not found" | Invalid column name | Verify column exists |
| "length of 'Ascending' must match 'By'" | Mismatched lengths | Match lengths or use single value |
| "NaPosition must be 'last' or 'first'" | Invalid NaPosition | Use NaLast or NaFirst |
| "type mismatch: cannot compare" | Mixed types in column | Ensure column has consistent types |
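
Several of these errors can be caught before sorting by checking the requested columns against the DataFrame's column names. The following is a minimal sketch of such a pre-check: `validateBy` is a hypothetical helper, not a GPandas function, and its messages only mirror the ones in the table.

```go
package main

import (
    "errors"
    "fmt"
)

// validateBy reports the first problem found in the requested sort columns:
// an empty list, or a name that is not among the available columns.
func validateBy(columns, by []string) error {
    if len(by) == 0 {
        return errors.New("'By' must contain at least one column")
    }
    known := make(map[string]struct{}, len(columns))
    for _, c := range columns {
        known[c] = struct{}{}
    }
    for _, name := range by {
        if _, ok := known[name]; !ok {
            return fmt.Errorf("column '%s' not found", name)
        }
    }
    return nil
}

func main() {
    columns := []string{"Name", "Department", "Age", "Salary"}

    fmt.Println(validateBy(columns, []string{"Department", "Salary"})) // <nil>
    fmt.Println(validateBy(columns, []string{"Score"}))                // column 'Score' not found
    fmt.Println(validateBy(columns, nil))                              // 'By' must contain at least one column
}
```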

 

Error Handling Example

// Requires "strings" in the import block.
sorted, err := df.SortValues(dataframe.SortOptions{
    By:        []string{"Department", "Salary"},
    Ascending: []bool{true, false},
})
if err != nil {
    switch {
    case strings.Contains(err.Error(), "not found"):
        log.Fatal("Column doesn't exist in DataFrame")
    case strings.Contains(err.Error(), "must match"):
        log.Fatal("Ascending slice length mismatch")
    case strings.Contains(err.Error(), "type mismatch"):
        log.Fatal("Cannot compare values of different types")
    default:
        log.Fatalf("Sort error: %v", err)
    }
}

 


 

Thread Safety

Sorting operations are thread-safe:

| Method | Lock Type | Description |
|---|---|---|
| SortValues() | RLock | Read lock during value extraction |
| SortIndex() | RLock | Read lock during index extraction |
| In-place sort | Lock | Write lock when modifying original |

 

Concurrent Sorting Example

package main

import (
    "fmt"
    "sync"

    "github.com/apoplexi24/gpandas"
    "github.com/apoplexi24/gpandas/dataframe"
)

func main() {
    gp := gpandas.GoPandas{}
    df, err := gp.Read_csv("data.csv")
    if err != nil {
        fmt.Printf("Failed to load data: %v\n", err)
        return
    }

    var wg sync.WaitGroup
    columns := []string{"Age", "Salary", "Name"}

    // Multiple goroutines can sort simultaneously
    for i, col := range columns {
        wg.Add(1)
        go func(id int, col string) {
            defer wg.Done()

            // Safe concurrent sort (creates new DataFrame)
            sorted, err := df.SortValues(dataframe.SortOptions{
                By: []string{col},
            })
            if err != nil {
                fmt.Printf("Goroutine %d: sort by %s failed: %v\n", id, col, err)
                return
            }
            fmt.Printf("Goroutine %d sorted by %s: %d rows\n",
                id, col, len(sorted.Index))
        }(i, col)
    }
    
    wg.Wait()
}

 


 

Complete Example: Data Analysis Pipeline

package main

import (
    "fmt"
    "log"

    "github.com/apoplexi24/gpandas"
    "github.com/apoplexi24/gpandas/dataframe"
)

func main() {
    gp := gpandas.GoPandas{}
    
    // Load employee data
    df, err := gp.Read_csv("employees.csv")
    if err != nil {
        log.Fatalf("Failed to load data: %v", err)
    }
    
    fmt.Println("Original Data:")
    fmt.Println(df.String())
    
    // Step 1: Sort by department and salary
    sorted, err := df.SortValues(dataframe.SortOptions{
        By:        []string{"Department", "Salary"},
        Ascending: []bool{true, false},
    })
    if err != nil {
        log.Fatalf("Sort failed: %v", err)
    }
    
    fmt.Println("\nSorted by Department (asc) and Salary (desc):")
    fmt.Println(sorted.String())
    
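    // Step 2: Keep only the report columns. Select picks columns, not rows,
    // so isolating the top earner per department would need an extra step.
    topEarners, err := sorted.Select("Name", "Department", "Salary")
    if err != nil {
        log.Fatalf("Select failed: %v", err)
    }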
    
    // Step 3: Export results
    _, err = topEarners.ToCSV("top_earners.csv", ",")
    if err != nil {
        log.Printf("Export warning: %v", err)
    }
    
    fmt.Println("\nExported to top_earners.csv")
}

 


 

Performance Considerations

| Scenario | Performance | Recommendation |
|---|---|---|
| Small DataFrames (<1000 rows) | Fast | Use any sort configuration |
| Large DataFrames (>10000 rows) | Moderate | Consider Inplace: true to save memory |
| Multi-column sort | Slower than single | Minimize number of sort columns |
| String columns | Slower than numeric | Use numeric keys when possible |

 


 

See Also