Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make os.Path and os.RelPath implement CharSequence #247

Closed

Conversation

Flowdalic
Copy link
Contributor

@Flowdalic Flowdalic commented Feb 9, 2024

Having os.Path and os.RelPath, in fact os.FilePath, implement CharSequence reduces the boilerplate when using functions that take a input collection and produce some form of textual output.

Consider the following example

def printBashArray(out: Appendable, name: String, content: Seq[CharSequence]) =
  out.append(f"${name}=(\n")
  for c <- content
  do out.append(f"\t${c}\n")
  out.append(")\n")

and

val fooPaths: Seq[os.RelPath]

Currently we have to

printBashArray(out, "FOO_PATHS", fooPaths.map(_.toString))

after this change, we can simply

printBashArray(out, "FOO_PATHS", fooPaths)

(Of course, we could also declare 'content' as Seq[Any], but this blurres the intention.)

@lefou
Copy link
Member

lefou commented Feb 10, 2024

Currently we have to

printBashArray(out, "FOO_PATHS", fooPaths.map(_.toString))

after this change, we can simply

printBashArray(out, "FOO_PATHS", fooPaths.map(_.toString))

I can't spot a difference.

I don't think os.Path qualifies to IS-A Seq[CharSequence]. A os.Path represents a file system resources represented itself by a strings. Isn't path.segements what you need? It returns an Iterator[String].

Having os.Path and os.RelPath, in fact os.FilePath, implement
CharSequence reduces the boilerplate when using functions that take a
input collection and produce some form of textual output.

Consider the following example

def printBashArray(out: Appendable, name: String, content: Seq[CharSequence]) =
  out.append(f"${name}=(\n")
  for c <- content
  do out.append(f"\t${c}\n")
  out.append(")\n")

and

val fooPaths: Seq[os.RelPath]

Currently we have to

printBashArray(out, "FOO_PATHS", fooPaths.map(_.toString))

after this change, we can simply

printBashArray(out, "FOO_PATHS", fooPaths)

(Of course, we could also declare 'content' as Seq[Any], but this
blurres the intention.)
@Flowdalic Flowdalic force-pushed the file-path-char-sequence branch from 383bff3 to 103cbe7 Compare February 11, 2024 09:08
@Flowdalic
Copy link
Contributor Author

I can't spot a difference.

Sorry, copy&paste mistake. Now fixed.

I don't think os.Path qualifies to IS-A Seq[CharSequence].

This PR is about os.Path and os.RelPath to qualify as CharSequence.

I have made good experience with making specialized types1 that are essentially a sequence of characters (minus potentially additionally restrictions) implement the CharSequence interface. Having them implement the CharSequence interface reduces the boilerplate, see my adjusted example, and make the overall handling of them more natural.

1: Examples include MacAddress, EMailAddress and Jid.

@lefou
Copy link
Member

lefou commented Feb 11, 2024

I'm still not convinced. I even don't understand your motivation at all.

If you need a CharSequence out of a os.Path, why can't you just use toString, since every String is already a CharSequence?

printBashArray(out, "FOO_PATHS", fooPaths.toString)

A path is not a char sequence. E.g. charAt or subSequence makes no sense on a multi-segemnt path. If you need a String, use toString.

@lefou lefou closed this Feb 11, 2024
@Flowdalic
Copy link
Contributor Author

Flowdalic commented Feb 11, 2024

printBashArray(out, "FOO_PATHS", fooPaths.toString)

I think you may have missed that fooPaths type is Seq[os.Path], therefore what you suggested will not work. Hopefully, the motivation is now clearer and you may re-evaluate the usefulness of this change and re-consider this PR.

@lefou
Copy link
Member

lefou commented Feb 11, 2024

If feeding Seq[os.Path] is a typical use case of your APIs, you could provide an implicit conversion.

implicit def pathToCharSeq(path: os.Path): CharSequence = path.toString

@Flowdalic
Copy link
Contributor Author

If feeding Seq[os.Path] is a typical use case of your APIs,

Yes, I do it constantly. Again, if you have a dedicated type for something that is (typically) expressed as a String, like MacAddress or EMailAddress, then it is very sensible to have that type implement CharSequence, as you will often end up feeding them to an API that does some sort of output (either UI output or textual output of some sort, like an XML writer).

you could provide an implicit conversion.

Unfortunately, this will not help.

This is not a case where implicit or given Conversation will be applied. As soon as the to-be-converted type is the parameterized type of a generic class, as is the case here, the implicit conversation to the expected type will not be applied.

For example:

#!/usr/bin/env -S scala-cli shebang
//> using scala "3.3.1"
//> using jvm "system"
//> using dep "com.lihaoyi::os-lib:0.9.3"

object GivenConversion:
 given Conversion[os.FilePath, CharSequence] = (f: os.FilePath) => f.toString()

def myMkString(input: Seq[CharSequence]) = input.map(_.toString).mkString("- ", "\n- ", "\n")

val paths = Seq(
  os.Path("/foo"),
  os.Path("/bar"),
)

import GivenConversion.*
myMkString(paths)

Results in

[error] ./os-path-given-conversion.sc:19:12
[error] Found:    (os$minuspath$minusgiven$minusconversion$_.this.paths : Seq[os.Path])
[error] Required: Seq[CharSequence]
[error] myMkString(paths)
[error]            ^^^^^

I wonder where this aversion against subclassing CharSequence stems from besides not understanding the issue doing so fixes. I have not heard any downsides to os.Path implementing the CharSequence interface.

Do you want os-lib users to deal with additional friction when using the os-lib API although it could be avoided?

@SethTisue
Copy link
Contributor

SethTisue commented Feb 12, 2024

fwiw, I agree with @lefou — tons of data types have a string representation that is relatively frequently needed, but that doesn't mean they should all implement CharSequence. the purpose of the CharSequence interface, as I understand it, is to capture the commonality between domain-independent String-like things like StringBuilder and StringBuffer — rather than to be a sort of poor man’s implicit conversion from domain-specific types to String

@SethTisue
Copy link
Contributor

SethTisue commented Feb 12, 2024

Note that the Java standard library designers seem to agree as well:

scala 2.13.12> java.nio.file.Path.of("/tmp")
val res1: java.nio.file.Path = /tmp

scala 2.13.12> res1: java.lang.CharSequence
               ^
               error: type mismatch;
                found   : java.nio.file.Path
                required: CharSequence

@lefou
Copy link
Member

lefou commented Feb 12, 2024

@Flowdalic ,

Trying to help you with your use case, I think your example should work with

implicit def pathToCharSeq(paths: Seq[os.Path]): Seq[CharSequence] = paths.map(_.toString)

I wonder where this aversion against subclassing CharSequence stems from besides not understanding the issue doing so fixes.

Wow. "Aversion". I think you misunderstood something. And I don't like the tone.

I have not heard any downsides to os.Path implementing the CharSequence interface.

I see it as @SethTisue. A domain type using a string internally, isn't necessarily a string itself. Also, I wrote before:

A path is not a char sequence. E.g. charAt or subSequence makes no sense on a multi-segemnt path. If you need a String, use toString.

Doesn't that qualify as at least two downsides?

Do you want os-lib users to deal with additional friction when using the os-lib API although it could be avoided?

Absolutely not. I simply don't agree this is an issue that produces "friction".

@Flowdalic
Copy link
Contributor Author

Flowdalic commented Feb 12, 2024

Wow. "Aversion". I think you misunderstood something. And I don't like the tone.

I am sorry, that was not intended to be in any way an attack. Maybe my understanding what 'aversion' expresses is not correct.

A path is not a char sequence. E.g. charAt or subSequence makes no sense on a multi-segemnt path. If you need a String, use toString.

Doesn't that qualify as at least two downsides?

That is something I do not understand: Assume val myPath: os.Path, then why should myPath.subSequence(3, 7) not be sensible, while myPath.toString().subSequence(3, 7) clearly is?

Note that the Java standard library designers seem to agree as well:

I like the Java standard library very much. I think it is one of the best designed standard libraries out there. That said, it has its warts and this is one of those.

I leave at it that and with saying again that I am sorry if my reply came across as impolite. That wasn't my intention, I just wanted to learn about the arguments against this change.

@lefou
Copy link
Member

lefou commented Feb 12, 2024

@Flowdalic, since you asked, let me add a last comment to this PR.

why should myPath.subSequence(3, 7) not be sensible, while myPath.toString().subSequence(3, 7) clearly is?

The latter case is just some known string API on a string, and we all know strings are arrays or sequences of characters. Since its concept is that old, it's rather "intuitive" API.

But in the former case, we're dealing with a path. So what could subSequence(3,7) on a path mean? Since a path has segments, maybe it could mean the sub-path starting at the third segment?

So, since a CharSequence is just a Java-esque of Seq[Char], but a path is already a Seq[PathSegment] (at least conceptually, in fact, segments is a Seq[String]), how can a type be a Seq of a path segment (String) and a single character (Char) at the same time? Once we spot discrepancies like that in an API, we should rather avoid it.

I understand the desire to automagically support your favorite domain types everywhere. But implementing a CharSequence in os.Path isn't the right choice. It may very well be a legit compromise in other languages, but not in Scala as I know it.

To pick up your other example, the XML writer; in Scala, the preferred or typical way to support different domain type to be written as XML is the concept of type classes.

@Flowdalic
Copy link
Contributor Author

Thanks for the example. That provides some insight about why you reject this change. However the rationale

But in the former case, we're dealing with a path. So what could subSequence(3,7) on a path mean? Since a path has segments, maybe it could mean the sub-path starting at the third segment

i.e., that the behavior of subSequence ()is ambiguous Is flawed, because CharSequence defines its methods based on the return value of its toString() method.

The toString() method is defined as returning "... a string containing the characters in this sequence in the same order as this sequence." And subSequence(start, end) is defined as a subsequence of this sequence starting with the char value at the specified index (the end index is similarly defined).

If you return anything else than the string from the 3 to 7th (exclusive) char, the you would violate the contract of CharSequence.

There is really no ambiguity introduced by this PR about what os.Path.subSequence(3, 7) would return.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants