flatMap alternatives when printing #220
Replies: 4 comments 5 replies
-
Unfortunately I don't think there is a general correct approach; it might depend heavily on the use case. In your example this could be done by parsing three things (the stuff before the ":", then the stuff before the "_", and then everything to the end of the line) and then filtering and converting the result:

let p = ParsePrint {
  Prefix { $0 != ":" }
  ":"
  Prefix { $0 != "_" }
  "_"
  Prefix { $0 != "\n" }
}
.filter { key, value1, value2 in
  key == value2
}
.map(.convert { key, value1, value2 in
  (key, value1 + "_" + value2)
} unapply: { key, value in
  // splitOnce is a helper that is not defined in this snippet
  let (value1, value2) = try value.splitOnce("_")
  return (key, value1, value2)
})

try p.parse("foo:something_foo") // ✅ ("foo", "something_foo")
try p.parse("foo:something_bar") // ❌
try p.print(("foo", "something_foo")) // ✅ "foo:something_foo"
try p.print(("foo", "something_bar")) // ❌

I think the real problem here is that …
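The unapply closure above leans on a splitOnce helper that the snippet never defines. Here is a minimal sketch of what it might look like, assuming it should split at the first occurrence of the separator and throw when there is none. The name comes from the snippet; the signature and the SplitError type are my own assumptions, not part of the library.

```swift
/// Hypothetical error type, introduced only for this sketch.
struct SplitError: Error {}

extension StringProtocol {
  /// Hypothetical helper: splits at the first occurrence of `separator` and
  /// returns the text before and after it. Throws if `separator` is absent.
  func splitOnce(_ separator: Character) throws -> (SubSequence, SubSequence) {
    guard let index = self.firstIndex(of: separator) else { throw SplitError() }
    return (self[..<index], self[self.index(after: index)...])
  }
}

let (value1, value2) = try "something_foo".splitOnce("_")
// value1 == "something", value2 == "foo"
```

Splitting at the first "_" undoes the join in the apply closure only because the first piece can never contain a "_" itself, which the Prefix { $0 != "_" } parser guarantees on the parsing side.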
-
Prefixed-length fields simplify parsing logic and support binary strings (e.g. "5:hello"), and are a specific and somewhat common case of what Luke's describing. I recently came across them while parsing the "Bulk String" format from the Redis Serialization Protocol (RESP), which looks like this:

"$11\r\nhello world\r\n"

where the "$" indicates that the record is a Bulk String, "11" is the field length of the binary-safe string, and the two "\r\n" are delimiters. I resorted to a custom parser:

public struct RespBulkStringBody: ParserPrinter {
  public func parse(_ input: inout Substring) throws -> Substring {
    let length = try Int.parser().parse(&input)
    return try self.fixedFieldParser(length).parse(&input)
  }

  public func print(_ output: Substring, into input: inout Substring) throws {
    let length = output.lengthOfBytes(using: .utf8)
    try self.fixedFieldParser(length).print(output, into: &input)
    try Int.parser().print(length, into: &input)
  }

  private func fixedFieldParser(_ length: Int)
    -> ParsePrint<
      ParserBuilder.ZipVOV<String, Prefix<Substring>, String>
    >
  {
    ParsePrint {
      "\r\n"
      Prefix(length)
      "\r\n"
    }
  }
}

let respBulkString = Parse {
  "$"
  RespBulkStringBody()
}

let input = "$11\r\nhello world\r\n"
let bulkStringValue = try respBulkString.parse(input) // → "hello world"
let printed = try respBulkString.print(bulkStringValue) // → "$11\r\nhello world\r\n"
printed == input // → true

Note: For simplicity, my parser is using …

This is a relatively simple custom parser, but custom parsers feel like a fairly sharp tool that is easy to get wrong (hint: this one's probably not right!) and that doesn't necessarily compose nicely. Of course, I'm not sure how this kind of dependency between parsers could be expressed in a general way, and even if it could, whether it'd be worth doing.

Aside: RESP might make for an interesting case study. RESP also uses a prefixed length to represent arrays of (potentially deeply nested) RESP types.
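To make the dependency concrete without any library machinery, here is a rough standalone sketch of the same length-prefixed round trip. It works on raw UTF-8 bytes so that the length prefix and the body are measured in the same unit. That may be one place where the Substring version above could disagree with itself, since as far as I can tell Prefix(length) on a Substring counts Characters while lengthOfBytes counts bytes. All names here (RespError, expect, parseBulkString, printBulkString) are made up for illustration and are not swift-parsing API.

```swift
/// Hypothetical error type for this sketch.
struct RespError: Error {}

let crlf: [UInt8] = [13, 10]  // the bytes of "\r\n"

/// Consumes `expected` from the front of `input`, or throws.
func expect(_ expected: [UInt8], from input: inout ArraySlice<UInt8>) throws {
  guard input.starts(with: expected) else { throw RespError() }
  input.removeFirst(expected.count)
}

/// Parses "$<length>\r\n<body>\r\n". The length parsed first decides how many
/// bytes the body consumes; this is the parser-to-parser dependency.
func parseBulkString(_ input: inout ArraySlice<UInt8>) throws -> [UInt8] {
  try expect([UInt8(ascii: "$")], from: &input)
  let digits = input.prefix(while: { $0 >= UInt8(ascii: "0") && $0 <= UInt8(ascii: "9") })
  guard let length = Int(String(decoding: digits, as: UTF8.self)) else { throw RespError() }
  input.removeFirst(digits.count)
  try expect(crlf, from: &input)
  guard input.count >= length else { throw RespError() }
  let body = Array(input.prefix(length))
  input.removeFirst(length)
  try expect(crlf, from: &input)
  return body
}

/// Prints the inverse. The length is recomputed from the body, so the printer
/// never needs it as a separate piece of output.
func printBulkString(_ body: [UInt8]) -> [UInt8] {
  Array("$\(body.count)".utf8) + crlf + body + crlf
}

// "h\u{E9}llo" is 5 characters but 6 UTF-8 bytes, so the prefix must say 6.
let original = Array("$6\r\nh\u{E9}llo\r\n".utf8)
var remaining = original[...]
let body = try parseBulkString(&remaining)
let roundTripped = printBulkString(body)  // equal to `original`
```

The printing side only works because the length can be derived from the body. That seems to be the property any general tool for this would have to rely on.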
-
Very interesting. I'll have to think a bit more on it. It does feel like there is some general tool we could make to help with this, but it is eluding me right now.
-
Working on a similar problem: I'm trying to make my XMLParser print as well. I currently have:

let openingTagParser = ParsePrint {
  "<".utf8
  tagHeadParser // returns tagName (String) and attributes ([String: String])
  ">".utf8
}

// This flatMap is hard to replace, since I need the tag name to search for the closing tag.
let containerTagParser = openingTagParser.flatMap { tagName, attributes in
  let tag = "</\(tagName)>".utf8
  Whitespace()
  PrefixUpTo(tag).pipe {
    Lazy { xmlBodyParser } // parses the content nested in this XML tag
    End()
  }.map { xml in
    return (tagName, attributes, xml) // a mapping that is easily replaceable by a conversion
  }.map {
    XML.element($0.0, $0.1, $0.2) // another mapping that's replaceable by a conversion
  }
  tag
}

Any idea how to approach this? Perhaps the real question is what restrictions a flatMap would need in order to be reversible.
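One possible answer to the restriction question, sketched on a toy parser-printer type rather than on swift-parsing itself: the flatMap stays reversible if the caller also supplies a function that recovers the first parser's output from the final output. Everything below (PP, TagError, Element, the recover: parameter) is hypothetical and only meant to illustrate the idea. It is not the library's API, and it ignores attributes and nesting.

```swift
import Foundation

/// Hypothetical error type for this sketch.
struct TagError: Error {}

/// A toy stand-in for a parser-printer (NOT the swift-parsing protocol).
struct PP<Output> {
  let parse: (inout Substring) throws -> Output
  let print: (Output, inout String) throws -> Void  // appends to the buffer
}

extension PP {
  /// Hypothetical reversible flatMap: `recover` must extract this parser's
  /// output from the final output, so printing can rebuild the second parser.
  func flatMap<New>(
    recover: @escaping (New) throws -> Output,
    _ next: @escaping (Output) -> PP<New>
  ) -> PP<New> {
    PP<New>(
      parse: { input in
        let first = try self.parse(&input)
        return try next(first).parse(&input)
      },
      print: { value, output in
        let first = try recover(value)
        try self.print(first, &output)
        try next(first).print(value, &output)
      }
    )
  }
}

struct Element: Equatable { var name: String; var body: String }

/// Parses and prints "<name>".
let openingTag = PP<String>(
  parse: { input in
    guard input.first == "<", let end = input.firstIndex(of: ">") else { throw TagError() }
    let name = String(input[input.index(after: input.startIndex)..<end])
    input = input[input.index(after: end)...]
    return name
  },
  print: { name, output in output += "<\(name)>" }
)

/// Parses and prints "body</name>", which depends on the tag name.
func bodyAndClosingTag(_ name: String) -> PP<Element> {
  let closing = "</\(name)>"
  return PP<Element>(
    parse: { input in
      guard let range = input.range(of: closing) else { throw TagError() }
      let body = String(input[..<range.lowerBound])
      input = input[range.upperBound...]
      return Element(name: name, body: body)
    },
    print: { element, output in output += element.body + closing }
  )
}

let element = openingTag.flatMap(
  recover: { (element: Element) in element.name },
  bodyAndClosingTag
)

var xmlInput: Substring = "<b>bold</b>"
let parsed = try element.parse(&xmlInput)  // Element(name: "b", body: "bold")
var printedXML = ""
try element.print(parsed, &printedXML)     // printedXML == "<b>bold</b>"
```

In the XML case the restriction seems easy to meet, because the tag name is already part of the final XML.element value. It would not hold for a flatMap that throws the first output away.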
-
Currently, flatMap is not a parser-printer, so you can't use it if you need to print. I'm wondering what the alternative is, and how you could achieve similar functionality (where the output of one parser needs to feed into the next parser in some way) using conversions.

To take a trivial example, imagine a string "key:value" where the value of "key" determines how we should parse "value", e.g. "foo:something_foo": the "foo" key would tell us that what follows the ":" needs to be parsed in a particular way.

With flatMap, you could parse off the "key:" part and then switch over the value of "key", returning an appropriate parser for "value", e.g. FooParser or BarParser, which parses off the remainder.

What would be the correct approach here?
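To pin the question down, here is a plain-Swift sketch of the "key decides how the value is parsed" shape, written without the library. The Entry type, its two cases and the value formats (an integer for "foo", a boolean for "bar") are invented for illustration. The point is only that printing becomes possible once the output type remembers which key it came from.

```swift
/// Hypothetical error type for this sketch.
struct EntryError: Error {}

/// Hypothetical output type: each case records which key produced it.
enum Entry: Equatable {
  case foo(Int)   // "foo:<integer>"
  case bar(Bool)  // "bar:<true|false>"
}

func parseEntry(_ input: String) throws -> Entry {
  // Split "key:value" at the first colon.
  guard let colon = input.firstIndex(of: ":") else { throw EntryError() }
  let key = input[..<colon]
  let value = String(input[input.index(after: colon)...])
  // The key decides how the value is parsed (the flatMap-like step).
  switch key {
  case "foo":
    guard let number = Int(value) else { throw EntryError() }
    return .foo(number)
  case "bar":
    guard let flag = Bool(value) else { throw EntryError() }
    return .bar(flag)
  default:
    throw EntryError()
  }
}

func printEntry(_ entry: Entry) -> String {
  // The enum case plays the role of the key, so the key is recoverable.
  switch entry {
  case .foo(let number): return "foo:\(number)"
  case .bar(let flag): return "bar:\(flag)"
  }
}

let entry = try parseEntry("foo:42")  // .foo(42)
let text = printEntry(entry)          // "foo:42"
```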