fix(ecdatakit): print marker record describing max num. of columns (#424)

## Description

ECDataKit processes CSV files using the `polars` library. It seems that
`pl.read_csv` deduces the number of columns from the first record
it reads, and if it later encounters a longer record (with more columns),
it ignores the additional values.
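
To make the reported symptom concrete, here is a toy model in Rust. It is not polars and does not call it. It only mimics the behaviour described above, where the width is fixed by the first record and extra values in later records are dropped. The function name and the `popgen` record are made up for illustration, and fields are split naively on commas, with no quoting support.

```rust
/// Toy model of the behaviour described above (NOT polars itself):
/// the column count is taken from the first non-empty record, and any
/// extra values in later records are silently dropped.
/// Assumption: fields never contain quoted commas.
fn parse_with_first_record_width(contents: &str) -> Vec<Vec<String>> {
    let mut lines = contents.lines().filter(|line| !line.is_empty());

    // The first record decides how many columns every row may have.
    let Some(first) = lines.next() else {
        return Vec::new();
    };
    let width = first.split(',').count();

    std::iter::once(first)
        .chain(lines)
        .map(|line| line.split(',').take(width).map(str::to_owned).collect())
        .collect()
}
```

In this model, a short first record such as the hypothetical `popgen,12` cuts a following eight-field `iterinfo` record down to two fields. Putting an eight-field marker record first keeps all eight.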

I could handle it solely on the ECDataKit side by reading the whole data
file, prepending such a marker record, and then overwriting the file, but
doing it via ecrs seems easier for now.
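
For comparison, the alternative mentioned above (read the file, prepend a marker record, write it back) could look roughly like the sketch below. It is written in Rust to match the rest of this commit, although ECDataKit itself would do this in its own language. The helper names are hypothetical, the `event,col_N` naming follows the marker used in this commit, and fields are assumed to contain no quoted commas.

```rust
use std::{fs, io, path::Path};

/// Builds a marker record with `columns` fields:
/// "event" followed by col_1, col_2, ... (hypothetical helper).
fn marker_record(columns: usize) -> String {
    let mut fields = vec!["event".to_string()];
    fields.extend((1..columns).map(|i| format!("col_{i}")));
    fields.join(",")
}

/// Returns `contents` with a marker record as wide as its widest line prepended.
/// Assumption: naive comma splitting, no quoted fields.
fn with_marker(contents: &str) -> String {
    let widest = contents
        .lines()
        .map(|line| line.split(',').count())
        .max()
        .unwrap_or(1);
    format!("{}\n{}", marker_record(widest), contents)
}

/// Reads the whole file, prepends the marker record, and overwrites the file.
fn prepend_marker_in_place(path: &Path) -> io::Result<()> {
    let contents = fs::read_to_string(path)?;
    fs::write(path, with_marker(&contents))
}
```

Unlike the hard-coded record printed from `on_start`, this variant derives the width from the data, at the cost of reading and rewriting every file.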
kkafar authored Aug 25, 2023
1 parent 9a61cad commit 411e836
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion examples/jssp/problem/probe.rs
@@ -30,7 +30,12 @@ impl Probe<JsspIndividual> for JsspProbe {
     // iterinfo,<generation>,<eval_time>,<sel_time>,<cross_time>,<mut_time>,<repl_time>,<iter_time>
 
     #[inline]
-    fn on_start(&mut self, _metadata: &ecrs::ga::GAMetadata) {}
+    fn on_start(&mut self, _metadata: &ecrs::ga::GAMetadata) {
+        // This is a marker record for ECDataKit. It looks like polars.read_csv
+        // deduces the number of columns from the first encountered record, which
+        // leads to crashes when longer records are encountered deeper in the file.
+        info!(target: "csv", "event,col_1,col_2,col_3,col_4,col_5,col_6,col_7");
+    }
 
     fn on_initial_population_created(
         &mut self,