Revise CI and README #256

Merged 10 commits on Feb 1, 2025
4 changes: 4 additions & 0 deletions .codecov.yml
@@ -4,3 +4,7 @@ coverage:
project: true
patch: false
changes: false
+notify:
+after_n_builds: 6
+comment:
+after_n_builds: 6
7 changes: 7 additions & 0 deletions .editorconfig
@@ -22,6 +22,13 @@ indent_size = 4
insert_final_newline = true
charset = utf-8-bom

+# yaml files (for prettier)
+[*.{yml,yaml}]
+charset = utf-8
+insert_final_newline = true
+indent_size = 2
+end_of_line = auto
+
#sort System.* using directives alphabetically, and place them before other usings
dotnet_sort_system_directives_first = true

4 changes: 2 additions & 2 deletions .github/workflows/dotnet.yml
@@ -21,7 +21,7 @@ env:
NuGetDirectory: ${{ github.workspace }}/nuget

jobs:
-build:
+build-and-test:
strategy:
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
@@ -161,7 +161,7 @@ jobs:
path: ${{ env.NuGetDirectory }}/*nupkg

create-release-push:
-needs: [build, pack]
+needs: [build-and-test, pack]
runs-on: windows-latest
permissions:
contents: write
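
The rename in this hunk only holds together because `needs` is updated alongside the job key: a `needs` entry that names no defined job makes the workflow invalid. As a minimal sketch (not part of this PR), the check below flags such dangling references. The `undefined_needs` helper and the trimmed-down job dictionaries are hypothetical illustrations; only the job names come from the diff.

```python
def undefined_needs(jobs):
    """Map each job to the `needs` entries that name no defined job."""
    missing = {}
    for name, spec in jobs.items():
        needs = spec.get("needs", [])
        # `needs` may be a single job name or a list of job names.
        if isinstance(needs, str):
            needs = [needs]
        unknown = [dep for dep in needs if dep not in jobs]
        if unknown:
            missing[name] = unknown
    return missing


# Hypothetical, trimmed-down job graphs; real job bodies are omitted.
jobs_after_rename = {
    "build-and-test": {},
    "pack": {},
    "create-release-push": {"needs": ["build-and-test", "pack"]},
}
jobs_with_stale_needs = {
    "build-and-test": {},
    "pack": {},
    "create-release-push": {"needs": ["build", "pack"]},
}

print(undefined_needs(jobs_after_rename))
print(undefined_needs(jobs_with_stale_needs))
```

With the job renamed but `needs` left untouched, `create-release-push` would still point at `build`, which is the case the second dictionary models.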
5 changes: 4 additions & 1 deletion .github/workflows/super-linter.yml
@@ -36,8 +36,11 @@ jobs:
- name: Super-linter
uses: super-linter/super-linter@85f7611e0f7b53c8573cca84aa0ed4344f6f6a4d # v7.2.1
env:
-# dotnet format not working, checked during build instead
+# format not working, checked during build instead
VALIDATE_CSHARP: false
+VALIDATE_DOTNET_SLN_FORMAT_ANALYZERS: false
+VALIDATE_DOTNET_SLN_FORMAT_STYLE: false
+VALIDATE_DOTNET_SLN_FORMAT_WHITESPACE: false
# prettier defaults are very different than current markdown
VALIDATE_MARKDOWN_PRETTIER: false
VALIDATE_GIT_COMMITLINT: false
104 changes: 55 additions & 49 deletions README.md
@@ -1,4 +1,4 @@
-# Sep - ~~Possibly~~ the World's Fastest .NET CSV Parser
+# Sep - the World's Fastest .NET CSV Parser
![.NET](https://img.shields.io/badge/net8.0%20net9.0-5C2D91?logo=.NET&labelColor=gray)
![C#](https://img.shields.io/badge/C%23-13.0-239120?labelColor=gray)
[![Build Status](https://github.com/nietras/Sep/actions/workflows/dotnet.yml/badge.svg?branch=main)](https://github.com/nietras/Sep/actions/workflows/dotnet.yml)
@@ -20,54 +20,60 @@ machine learning use cases.

⭐ Please star this project if you like it. ⭐

-* **🌃 Modern** - utilizes features such as
-[`Span<T>`](https://learn.microsoft.com/en-us/archive/msdn-magazine/2018/january/csharp-all-about-span-exploring-a-new-net-mainstay),
-[Generic Math](https://devblogs.microsoft.com/dotnet/dotnet-7-generic-math/)
-([`ISpanParsable<T>`](https://learn.microsoft.com/en-us/dotnet/api/system.ispanparsable-1)/
-[`ISpanFormattable`](https://learn.microsoft.com/en-us/dotnet/api/system.ispanformattable)),
-[`ref struct`](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct),
-[`ArrayPool<T>`](https://learn.microsoft.com/en-us/dotnet/api/system.buffers.arraypool-1)
-and similar from [.NET 7+ and C#
-11+](https://nietras.com/2022/11/26/dotnet-and-csharp-versions/) for a modern
-and highly efficient implementation.
-* **🔎 Minimal** - a succinct yet expressive API with few options and no hidden
-changes to input or output. What you read/write is what you get. E.g. by
-default there is no "automatic" escaping/unescaping of quotes or trimming of
-spaces. To enable this see [SepReaderOptions](#sepreaderoptions) and
-[Unescaping](#unescaping) and [Trimming](#trimming). See
-[SepWriterOptions](#sepwriteroptions) for [Escaping](#escaping).
-* **🚀 Fast** - blazing fast with both architecture specific and cross-platform
-SIMD vectorized parsing incl. 64/128/256/512-bit paths e.g. AVX2, AVX-512
-(.NET 8.0+), NEON. Uses
-[csFastFloat](https://github.com/CarlVerret/csFastFloat) for fast parsing of
-floating points. See [detailed benchmarks](#comparison-benchmarks) for
-cross-platform results.
-* **🌪️ Multi-threaded** - unparalleled speed with highly efficient parallel CSV
-parsing that is [up to 35x faster than
-CsvHelper](#floats-reader-comparison-benchmarks), see
-[ParallelEnumerate](#parallelenumerate-and-enumerate) and
-[benchmarks](#comparison-benchmarks).
-* **🌀 Async support** - efficient `ValueTask` based `async/await` support.
-Requires C# 13.0+ and for .NET 9.0+ includes `SepReader` implementing
-`IAsyncEnumerable<>`. See [Async Support](#async-support) for details.
-* **🗑️ Zero allocation** - intelligent and efficient memory management allowing
-for zero allocations after warmup incl. supporting use cases of reading or
-writing arrays of values (e.g. features) easily without repeated allocations.
-* **✅ Thorough tests** - great code coverage and focus on edge case testing
-incl. randomized [fuzz testing](https://en.wikipedia.org/wiki/Fuzzing).
-* **🌐 Cross-platform** - works on any platform, any architecture supported by
-.NET. 100% managed and written in beautiful modern C#.
-* **✂️ Trimmable and AOT/NativeAOT compatible** - no problematic reflection or
-dynamic code generation. Hence, fully
-[trimmable](https://learn.microsoft.com/en-us/dotnet/core/deploying/trimming/prepare-libraries-for-trimming)
-and
-[Ahead-of-Time](https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/)
-compatible. With a simple console tester program executable possible in just a
-few MBs. 💾
-* **🗣️ Opinionated and pragmatic** - conforms to the essentials of
-[RFC-4180](https://www.ietf.org/rfc/rfc4180.txt), but takes an opinionated and
-pragmatic approach towards this especially with regards to quoting and line
-ends. See section [RFC-4180](#rfc-4180).
+**🌃 Modern** - utilizes features such as
+[`Span<T>`](https://learn.microsoft.com/en-us/archive/msdn-magazine/2018/january/csharp-all-about-span-exploring-a-new-net-mainstay),
+[Generic Math](https://devblogs.microsoft.com/dotnet/dotnet-7-generic-math/)
+([`ISpanParsable<T>`](https://learn.microsoft.com/en-us/dotnet/api/system.ispanparsable-1)/
+[`ISpanFormattable`](https://learn.microsoft.com/en-us/dotnet/api/system.ispanformattable)),
+[`ref struct`](https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/builtin-types/ref-struct),
+[`ArrayPool<T>`](https://learn.microsoft.com/en-us/dotnet/api/system.buffers.arraypool-1)
+and similar from [.NET 7+ and C#
+11+](https://nietras.com/2022/11/26/dotnet-and-csharp-versions/) for a modern
+and highly efficient implementation.
+
+**🔎 Minimal** - a succinct yet expressive API with few options and no hidden
+changes to input or output. What you read/write is what you get. E.g. by
+default there is no "automatic" escaping/unescaping of quotes or trimming of
+spaces. To enable this see [SepReaderOptions](#sepreaderoptions) and
+[Unescaping](#unescaping) and [Trimming](#trimming). See
+[SepWriterOptions](#sepwriteroptions) for [Escaping](#escaping).
+
+**🚀 Fast** - blazing fast with both architecture specific and cross-platform
+SIMD vectorized parsing incl. 64/128/256/512-bit paths e.g. AVX2, AVX-512
+(.NET 8.0+), NEON. Uses
+[csFastFloat](https://github.com/CarlVerret/csFastFloat) for fast parsing of
+floating points. See [detailed benchmarks](#comparison-benchmarks) for
+cross-platform results.
+
+**🌪️ Multi-threaded** - unparalleled speed with highly efficient parallel CSV
+parsing that is [up to 35x faster than
+CsvHelper](#floats-reader-comparison-benchmarks), see
+[ParallelEnumerate](#parallelenumerate-and-enumerate) and
+[benchmarks](#comparison-benchmarks).
+
+**🌀 Async support** - efficient `ValueTask` based `async/await` support.
+Requires C# 13.0+ and for .NET 9.0+ includes `SepReader` implementing
+`IAsyncEnumerable<>`. See [Async Support](#async-support) for details.
+**🗑️ Zero allocation** - intelligent and efficient memory management allowing
+for zero allocations after warmup incl. supporting use cases of reading or
+writing arrays of values (e.g. features) easily without repeated allocations.
+
+**✅ Thorough tests** - great code coverage and focus on edge case testing
+incl. randomized [fuzz testing](https://en.wikipedia.org/wiki/Fuzzing).
+**🌐 Cross-platform** - works on any platform, any architecture supported by
+.NET. 100% managed and written in beautiful modern C#.
+**✂️ Trimmable and AOT/NativeAOT compatible** - no problematic reflection or
+dynamic code generation. Hence, fully
+[trimmable](https://learn.microsoft.com/en-us/dotnet/core/deploying/trimming/prepare-libraries-for-trimming)
+and
+[Ahead-of-Time](https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/)
+compatible. With a simple console tester program executable possible in just a
+few MBs. 💾
+
+**🗣️ Opinionated and pragmatic** - conforms to the essentials of
+[RFC-4180](https://www.ietf.org/rfc/rfc4180.txt), but takes an opinionated and
+pragmatic approach towards this especially with regards to quoting and line
+ends. See section [RFC-4180](#rfc-4180).
+

[Example](#example) | [Naming and Terminology](#naming-and-terminology) | [API](#application-programming-interface-api) | [Limitations and Constraints](#limitations-and-constraints) | [Comparison Benchmarks](#comparison-benchmarks) | [Example Catalogue](#example-catalogue) | [RFC-4180](#rfc-4180) | [FAQ](#frequently-asked-questions-faq) | [Public API Reference](#public-api-reference)
