Filtered dataFrame displays original dataFrame values

I'm importing a CSV file of 299 rows and creating a subset by filtering on one column value (24 rows). I want to chart just the filtered values. However, when I print one column of the filtered DataFrame, I get values from the original DataFrame. Any suggestions? Thanks, David

The code:

import SwiftUI
import Charts
import TabularData

struct DataPoint: Identifiable {
    var id = UUID() // This makes it conform to Identifiable
    var date: Date
    var value: Double
}

struct ContentView: View {
    
    @State private var dataPoints: [DataPoint] = []
    
    var body: some View {
        Text("Hello")
        Chart {
            ForEach(dataPoints) { dataPoint in
                PointMark(
                    x: .value("Date", dataPoint.date),
                    y: .value("Value", dataPoint.value)
                )
            }
        }
        .frame(height: 300)
        .padding()
        .onAppear(perform: loadData)
    }
    
    func loadData() {
        print("In Loading Data")
        // Load the CSV file
        if let url = Bundle.main.url(forResource: "observations", withExtension: "csv") {
            do {
                let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
                var data0 = try DataFrame(contentsOfCSVFile: url, options: options)
                
                let formattingOptions = FormattingOptions(
                    maximumLineWidth: 200,
                    maximumCellWidth: 15,
                    maximumRowCount: 30
                )
                
//                print(data0.description(options: formattingOptions))
                print("Number of Columns: \(data0.columns.count)")
                
                let columnsSet = ["plant_id", "date", "plot", "plantNumber", "plantCount"]
                                
                data0 = try DataFrame(contentsOfCSVFile:url, columns: columnsSet, options: options)
                print("Number of Columns (after columnsSet): \(data0.columns.count)")
                print("Printing data0")
                print(data0.description(options: formattingOptions))
                
                let data = data0.filter { $0["plant_id"] as? Int == 15 }
                print("Printing data")
                print(data.description(options: formattingOptions))

                print(" Number of Rows \(data.rows.count)")
                for i in 0 ... data.rows.count {
//                    print("\(i): \(data["plantCount"][i]!)")
                    if let plantCount = data["plantCount"][i] as? Int {
                        print("\(i): \(plantCount)")
                    } else {
                        print("\(i): Value not found or invalid type")
                    }
                }
                
                // Here I plan to add the filtered data to DataPoint
                var newDataPoints: [DataPoint] = []

                DispatchQueue.main.async {
                    dataPoints = newDataPoints
                }
            } catch {
                print("Error reading CSV file: \(error)")
            }
        }
        else
        {
            print("Didn't load csv file")
        }
    }
}

struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}

Here is the new DataFrame and the print output:

Printing data
┏━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃     ┃ plant_id ┃ date       ┃ plot  ┃ plantNumber ┃ plantCount ┃
┃     ┃ <Int>    ┃ <String>   ┃ <Int> ┃ <Int>       ┃ <Int>      ┃
┡━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ 0   │       15 │ 2023-09-07 │     1 │           5 │          5 │
│ 32  │       15 │ 2023-09-07 │     2 │          10 │         10 │
│ 38  │       15 │ 2023-09-07 │     2 │          20 │         20 │
│ 66  │       15 │ 2023-09-07 │     4 │          25 │         25 │
│ 77  │       15 │ 2023-09-07 │     5 │           5 │          5 │
│ 99  │       15 │ 2023-09-14 │     7 │          45 │         45 │
│ 142 │       15 │ 2024-05-30 │     1 │          20 │         20 │
│ 162 │       15 │ 2024-05-30 │     4 │           5 │          5 │
│ 169 │       15 │ 2024-05-30 │     5 │          10 │         10 │
│ 175 │       15 │ 2024-05-30 │     7 │          10 │         10 │
│ 188 │       15 │ 2024-07-11 │     1 │          20 │         40 │
│ 199 │       15 │ 2024-07-11 │     2 │           5 │          5 │
│ 215 │       15 │ 2024-07-11 │     5 │          20 │         30 │
│ 220 │       15 │ 2024-07-11 │     7 │          30 │         40 │
│ 236 │       15 │ 2024-09-06 │     1 │          20 │         60 │
│ 238 │       15 │ 2024-09-06 │     2 │          30 │         35 │
│ 248 │       15 │ 2024-09-06 │     5 │           5 │         35 │
│ 254 │       15 │ 2024-09-06 │     7 │          50 │         90 │
│ 267 │       15 │ 2025-05-04 │     1 │          10 │         70 │
│ 273 │       15 │ 2025-05-04 │     2 │          10 │         45 │
│ 282 │       15 │ 2025-05-04 │     5 │          10 │         45 │
│ 287 │       15 │ 2025-05-04 │     7 │          30 │        120 │
│ 292 │       15 │ 2025-05-04 │     8 │          10 │          0 │
│ 297 │       15 │ 2925-05-04 │     3 │          10 │          0 │
└─────┴──────────┴────────────┴───────┴─────────────┴────────────┘
24 rows, 5 columns

 Number of Rows 24
0: 5
1: 80
2: 1
3: 1
4: 1
5: 3
6: 3
7: 1
8: 6
9: 1
10: 1
11: 1
12: 1
13: 10
14: 50
15: 1
16: 2
17: 1
18: 3
19: 8
20: 5
21: 3
22: 7
23: 2
24: 1

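Filtered columns behave like Swift ArraySlices. In particular, they keep the indices of the original DataFrame; see "Slices Maintain Indices" in the ArraySlice documentation. That is why data["plantCount"][i] with i counting up from 0 returns values from the original rows 0, 1, 2, ... rather than from the 24 filtered rows.

You need to replace for i in 0 ... data.rows.count with for i in data.rows.indices.

The other option is to convert the slice back to a regular DataFrame, which is indexed from 0 again: let data = DataFrame(data0.filter { ... })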
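To illustrate the ArraySlice behavior mentioned above, here is a minimal sketch using only the Swift standard library. The array and its values are made up for illustration and are not taken from the CSV file; the point is only that a slice keeps the indices of its base collection, while copying it into a new Array starts again at 0.

```swift
// Made-up values standing in for one column of the original frame.
let plantCounts = [5, 80, 1, 1, 10, 20]

// Slicing does not renumber: this ArraySlice is indexed 2, 3, 4.
let slice = plantCounts[2..<5]
print(slice.startIndex, slice.endIndex)

// Iterating over the slice's own indices is safe.
// Subscripting with slice[0] would trap, because 0 is not a valid index here.
for i in slice.indices {
    print("\(i): \(slice[i])")
}

// Copying into a new Array gives zero-based indices again,
// analogous to wrapping a filtered slice in DataFrame(...).
let copy = Array(slice)
print(copy[0])
```

The same idea applies to the filtered DataFrame in the question: either iterate over the indices the slice actually has, or make a fresh copy so that counting from 0 is valid.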

Accepted Answer

Thank you. Creating a new DataFrame worked and is what I thought I was doing.

Converting 'data.rows.indices' to Int seemed to be an issue.
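One way to avoid index handling altogether is to iterate over the rows of the filtered slice directly and build the DataPoint array from them. The following is only a sketch under stated assumptions: the small in-memory frame stands in for observations.csv with made-up values, DataPoint is a simplified version of the struct in the question (without Identifiable, so the snippet does not need SwiftUI), and the date column is assumed to hold "yyyy-MM-dd" strings as in the printed output.

```swift
import Foundation
import TabularData

// Simplified stand-in for the question's DataPoint (assumption: no Identifiable here).
struct DataPoint {
    var date: Date
    var value: Double
}

// Small in-memory frame standing in for observations.csv (made-up values).
var data0 = DataFrame()
data0.append(column: Column<Int>(name: "plant_id", contents: [15, 3, 15]))
data0.append(column: Column<String>(name: "date", contents: ["2023-09-07", "2023-09-07", "2023-09-14"]))
data0.append(column: Column<Int>(name: "plantCount", contents: [5, 80, 45]))

// Parser for the date strings; the format is assumed from the printed table.
let formatter = DateFormatter()
formatter.locale = Locale(identifier: "en_US_POSIX")
formatter.timeZone = TimeZone(secondsFromGMT: 0)
formatter.dateFormat = "yyyy-MM-dd"

// Same row-based filter as in the question.
let filtered = data0.filter { $0["plant_id"] as? Int == 15 }

// Walk the filtered rows themselves, so no integer index is needed.
var newDataPoints: [DataPoint] = []
for row in filtered.rows {
    guard let text = row["date"] as? String,
          let date = formatter.date(from: text),
          let count = row["plantCount"] as? Int else { continue }
    newDataPoints.append(DataPoint(date: date, value: Double(count)))
}

print(newDataPoints.map(\.value))
```

Rows whose date or plantCount cannot be read are skipped here; whether that is the right policy depends on the data.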
