some problem.. log below #2

Open
cusco opened this issue May 14, 2013 · 16 comments
Comments

@cusco
cusco commented May 14, 2013

Hi, I tried the version on http://mike.laiosa.org/software/emlx2maildir/ with --recursive. It creates the maildir folder structure, but there is nothing inside besides folder names.

I then used "https://raw.github.com/mlaiosa/emlx2maildir/master/emlx2maildir.py" and got the following:

[mail: /home/fbarrancos]# python emlx2maildir.py ./Filipe\ Older.mbox/ test --recursive
Converting './Filipe Older.mbox' -> 'test/'
Converting './Filipe Older.mbox/2010' -> 'test/.2010'
Recursing into './Filipe Older.mbox/2010.mbox/317E4A9E-D814-4557-993B-F2C19135E30E'
Recursing into './Filipe Older.mbox/2010.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data'
Recursing into './Filipe Older.mbox/2010.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/2'
Recursing into './Filipe Older.mbox/2010.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/2/8'
Traceback (most recent call last):
  File "emlx2maildir.py", line 220, in <module>
    main()
  File "emlx2maildir.py", line 214, in main
    dry("Converting message %r" % msg, convert_one, msg, maildir)
  File "emlx2maildir.py", line 201, in dry
    return act(*_args, **_kwargs)
  File "emlx2maildir.py", line 107, in convert_one
    length = long(contents[:boundry])
ValueError: invalid literal for long() with base 10: ''

what can I do?

I have got 50G of files like

./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259087.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259932.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259005.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259297.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259566.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259904.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259578.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/._259575.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259669.emlx
./Filipe Older.mbox/2006.mbox/Sent.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/5/2/Messages/259208.emlx

@mlaiosa
Owner

mlaiosa commented May 15, 2013

I'll do my best to help you, but I must warn you that it's been a long time since I wrote emlx2maildir, so my knowledge of it has become a little rusty. I have a few questions to better understand your situation:

  1. What version of OSX are you using? emlx2maildir was originally written for 10.4, and someone added 10.7 support a little over a year ago. It's never been tested with any other version; we may have to adjust it some.
  2. What level of programming skill and experience do you have, and how proficient are you in Python in particular? It's fine if the answer is "none whatsoever", but even a little bit might help us do this faster.

Mike


@cusco
Author

cusco commented May 15, 2013

Hello,

The Mac OS X version is 10.8.3. I'm not aware of what changed.
Also, with the latest emlx2maildir.py here on GitHub, it converted about 15 to 20 messages before erroring out.

I do some programming now and then, but I have never written any Python whatsoever. I can see that the current problem is on line 107:
length = long(contents[:boundry])
where contents[:boundry] is different from what is expected.

boundry = contents.find("\x0a") <-- does this mean it finds the first occurrence of a line feed "\n", as a string position?

I can't understand what went wrong there...

But then again, Python is not my thing...
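
On the `find` question: yes, `find` returns the index of the first line feed, or -1 when there is none, and the slice before that index is what gets converted to a number. The following is only a minimal sketch in Python 3 (the script in this thread is Python 2, where `long` plays the role of `int`). The helper name `parse_length` and the sample byte strings are made up for illustration and are not taken from emlx2maildir.py.

```python
def parse_length(contents: bytes) -> int:
    """Return the number stated on the first line of the given bytes."""
    boundary = contents.find(b"\x0a")  # index of the first "\n", or -1 if absent
    if boundary == -1:
        raise ValueError("no newline found, so there is no length line")
    # Everything before the newline should be a decimal number.
    # An empty slice makes int() raise ValueError, much like long('') above.
    return int(contents[:boundary])


# Hypothetical well-formed start: a number on its own line, then more data.
print(parse_length(b"11\nhello world<plist/>"))

# Hypothetical bad input: the first byte is already a newline, so the
# slice before it is empty and the conversion fails.
try:
    parse_length(b"\nnot an emlx file")
except ValueError as exc:
    print("failed:", exc)
```

This only shows one way an empty slice can arise; it does not establish what the failing file actually contained.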

@mlaiosa
Owner

mlaiosa commented May 16, 2013

You very well might be the first person to try using emlx2maildir on OSX 10.8; we might need to adapt it to some change in the format. I do not have a 10.8 machine available to test with.

Run emlx2maildir with --verbose. Then it will print out the name of each .emlx file as it converts it. This will let you determine the specific .emlx file that we're having trouble with.

Open the troublesome .emlx in a text editor. Does it generally conform to the format I describe here: http://mike.laiosa.org/2009/03/01/emlx.html? It should roughly be a number on a line by itself, followed by an email message including headers, followed by some XML. The error message you reported in your first email suggests that the program is unable to parse the number. If the contents of the message are not confidential, could you zip up the .emlx file and send it to me?

Mike
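
The layout described above (a number on a line by itself, then the email, then some XML) can be sketched roughly as follows. This is a Python 3 illustration under stated assumptions, not the code in emlx2maildir.py: `split_emlx` is a hypothetical helper, the sample file is built by hand, the `flags` key is only a placeholder, and the standard `plistlib` module stands in for the script's own SAX-based `parse_plist`.

```python
import plistlib


def split_emlx(contents: bytes):
    """Split raw .emlx bytes into (message, metadata)."""
    boundary = contents.find(b"\n")
    length = int(contents[:boundary])         # byte count on the first line
    start = boundary + 1
    message = contents[start:start + length]  # the email, headers included
    trailer = contents[start + length:]       # the XML that follows the email
    metadata = plistlib.loads(trailer) if trailer.strip() else {}
    return message, metadata


# Hand-built sample: length line + message + XML property list.
email = b"Subject: hi\n\nbody\n"
plist = plistlib.dumps({"flags": 1})  # "flags" is an illustrative key only
sample = str(len(email)).encode() + b"\n" + email + plist

message, metadata = split_emlx(sample)
print(message == email, metadata)
```

If the first line is not a number, the `int(...)` call is where a sketch like this would fail, which is the same spot the reported traceback points to.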


@cusco
Author

cusco commented May 16, 2013

Hello, it errors out on the following file:

Filipe Older.mbox/2010.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/2/8/Messages/._82646.emlx: AppleDouble encoded Macintosh file

This is a weird file: its name starts with a dot.

perhaps I should ignore files starting with ._ ?
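
Names starting with `._` are AppleDouble sidecar files, which macOS writes next to a file when it is copied to a filesystem that cannot hold its extended attributes; they are not messages. A minimal Python 3 sketch of skipping them while walking a mailbox tree follows. The function name `iter_emlx` and the demo directory are hypothetical, and this is not how emlx2maildir.py itself enumerates files.

```python
import os
import tempfile


def iter_emlx(root):
    """Yield paths of .emlx files under root, skipping AppleDouble '._' sidecars."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.startswith("._"):
                continue  # AppleDouble sidecar, not a message
            if name.endswith(".emlx"):
                yield os.path.join(dirpath, name)


# Demo on a throwaway directory holding one message and one sidecar.
with tempfile.TemporaryDirectory() as root:
    messages = os.path.join(root, "Messages")
    os.makedirs(messages)
    for name in ("82646.emlx", "._82646.emlx"):
        with open(os.path.join(messages, name), "wb") as handle:
            handle.write(b"")
    found = [os.path.basename(path) for path in iter_emlx(root)]

print(found)  # ['82646.emlx']
```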

@cusco
Author

cusco commented May 16, 2013

hit another problem

Converting message 'Filipe Older.mbox/2010.mbox/Junk.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/2/0/1/Messages/102627.emlx'
Traceback (most recent call last):
  File "emlx2maildir.py", line 220, in <module>
    main()
  File "emlx2maildir.py", line 214, in main
    dry("Converting message %r" % msg, convert_one, msg, maildir)
  File "emlx2maildir.py", line 201, in dry
    return act(*_args, **_kwargs)
  File "emlx2maildir.py", line 109, in convert_one
    metadata = parse_plist(contents[boundry+1+length:])
  File "emlx2maildir.py", line 67, in parse_plist
    xml.sax.parseString(plist_xml, p)
  File "/usr/lib/python2.6/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.6/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 381, in external_entity_ref
    "")
  File "/usr/lib/python2.6/xml/sax/saxutils.py", line 298, in prepare_input_source
    f = urllib.urlopen(source.getSystemId())
  File "/usr/lib/python2.6/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.6/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.6/urllib.py", line 349, in open_http
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib/python2.6/httplib.py", line 1064, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 427, in readline
    data = recv(1)
IOError: [Errno socket error] [Errno 104] Connection reset by peer

@cusco
Author

cusco commented May 16, 2013

I restarted the process and it went past that file... let's see. I'll tell you if it works.

@mlaiosa
Owner

mlaiosa commented May 16, 2013

Cool. Not sure why it would have an error once and not a second time. Be sure that Mail.app is closed - you don't want it changing things out from under you.


@cusco
Author

cusco commented May 16, 2013

I copied all the folders to a disk and am running the script on a machine that is not using the files...

@cusco
Author

cusco commented May 16, 2013

Hi, the same problem again...

IOError: [Errno socket error] [Errno 104] Connection reset by peer

Is there a way to make the script resume?
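
One generic way to get resume behaviour is to record each finished file in a small log and skip those entries on the next run. The sketch below is Python 3 and entirely hypothetical: `convert_all`, the `done.txt` log, and the stand-in `convert` callback are not part of emlx2maildir.py, and the approach assumes paths contain no newline characters.

```python
import os
import tempfile


def convert_all(paths, convert, done_file):
    """Run convert(path) for each path not already listed in done_file."""
    done = set()
    if os.path.exists(done_file):
        with open(done_file, encoding="utf-8") as handle:
            done = {line.rstrip("\n") for line in handle}
    converted = []
    with open(done_file, "a", encoding="utf-8") as log:
        for path in paths:
            if path in done:
                continue         # finished in an earlier run
            convert(path)        # may raise; the path is then not recorded
            log.write(path + "\n")
            log.flush()          # keep the record current if a later file fails
            converted.append(path)
    return converted


# Demo: the second run only handles the file that was not seen before.
with tempfile.TemporaryDirectory() as tmp:
    state = os.path.join(tmp, "done.txt")
    calls = []
    first = convert_all(["a.emlx", "b.emlx"], calls.append, state)
    second = convert_all(["a.emlx", "b.emlx", "c.emlx"], calls.append, state)

print(first, second)
```

A file that fails partway through is not logged, so it would be retried on the next run; whether a retry could leave a partial or duplicate message in the maildir depends on the converter and is not addressed here.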

@cusco
Author

cusco commented May 17, 2013

It keeps erroring out, each time at a different file... it would be nice to skip the files that were already parsed...

.
Converting message 'Filipe Older.mbox/2010.mbox/Junk.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/9/1/1/Messages/119901.emlx'
Traceback (most recent call last):
  File "emlx2maildir.py", line 220, in <module>
    main()
  File "emlx2maildir.py", line 214, in main
    dry("Converting message %r" % msg, convert_one, msg, maildir)
  File "emlx2maildir.py", line 201, in dry
    return act(*_args, **_kwargs)
  File "emlx2maildir.py", line 109, in convert_one
    metadata = parse_plist(contents[boundry+1+length:])
  File "emlx2maildir.py", line 67, in parse_plist
    xml.sax.parseString(plist_xml, p)
  File "/usr/lib/python2.6/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.6/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 381, in external_entity_ref
    "")
  File "/usr/lib/python2.6/xml/sax/saxutils.py", line 298, in prepare_input_source
    f = urllib.urlopen(source.getSystemId())
  File "/usr/lib/python2.6/urllib.py", line 86, in urlopen
    return opener.open(url)
  File "/usr/lib/python2.6/urllib.py", line 207, in open
    return getattr(self, name)(url)
  File "/usr/lib/python2.6/urllib.py", line 349, in open_http
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib/python2.6/httplib.py", line 1064, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.6/socket.py", line 427, in readline
    data = recv(1)
IOError: [Errno socket error] [Errno 104] Connection reset by peer

@mlaiosa
Owner

mlaiosa commented May 17, 2013

It would actually be somewhat tricky to make it pick up where it left off, but I pushed a change this morning that should stop you from getting errors in the first place.

Mike


@cusco
Author

cusco commented May 19, 2013

Hi,
I guess it was the XML DOCTYPE thingie, but this script just got A LOT faster!

Let's see how it goes.

Thank you
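
The earlier tracebacks pass through `external_entity_ref` and `urllib.urlopen`, which is consistent with the SAX parser fetching the DTD named in the plist's DOCTYPE over the network for every message. The thread does not show the change that was actually pushed, so the following is only one possible way to keep `xml.sax` off the network, written for Python 3. `KeyCollector`, `parse_offline`, and `SAMPLE` are made-up names for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class KeyCollector(ContentHandler):
    """Collect the text of every <key> element in a property list."""

    def __init__(self):
        super().__init__()
        self.keys = []
        self._in_key = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "key":
            self._in_key = True
            self._buf = []

    def characters(self, content):
        if self._in_key:
            self._buf.append(content)

    def endElement(self, name):
        if name == "key":
            self.keys.append("".join(self._buf))
            self._in_key = False


def parse_offline(plist_xml: bytes):
    """Parse plist XML without resolving the external DTD."""
    parser = xml.sax.make_parser()
    # Tell the parser not to load external general entities (such as the DTD).
    parser.setFeature(feature_external_ges, False)
    handler = KeyCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(plist_xml))
    return handler.keys


# Hand-written sample with the usual plist DOCTYPE; "flags" is a placeholder key.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0"><dict><key>flags</key><integer>1</integer></dict></plist>
"""

print(parse_offline(SAMPLE))  # ['flags']
```

Python 3.7.1 and later already leave this feature off by default, so the explicit `setFeature` call mainly matters on older interpreters such as the Python 2.6 seen in the tracebacks.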

@cusco
Author

cusco commented May 19, 2013

[mail: /home/fbarrancos]# du -sh test/ && sleep 10 && du -sh test/
2.9G test/
3.1G test/

@cusco
Author

cusco commented May 21, 2013

Hello,

It still errored out. I'm going to restart the process...

Converting message 'More.mbox/[email protected]/INBOX.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/8/4/Messages/48789.emlx'
Converting message 'More.mbox/[email protected]/INBOX.mbox/317E4A9E-D814-4557-993B-F2C19135E30E/Data/8/4/Messages/48453.emlx'
Traceback (most recent call last):
  File "emlx2maildir.py", line 223, in <module>
    main()
  File "emlx2maildir.py", line 217, in main
    dry("Converting message %r" % msg, convert_one, msg, maildir)
  File "emlx2maildir.py", line 204, in dry
    return act(*_args, **_kwargs)
  File "emlx2maildir.py", line 112, in convert_one
    metadata = parse_plist(contents[boundry+1+length:])
  File "emlx2maildir.py", line 70, in parse_plist
    xml.sax.parseString(plist_xml, p)
  File "/usr/lib/python2.6/xml/sax/__init__.py", line 49, in parseString
    parser.parse(inpsrc)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 107, in parse
    xmlreader.IncrementalParser.parse(self, source)
  File "/usr/lib/python2.6/xml/sax/xmlreader.py", line 123, in parse
    self.feed(buffer)
  File "/usr/lib/python2.6/xml/sax/expatreader.py", line 211, in feed
    self._err_handler.fatalError(exc)
  File "/usr/lib/python2.6/xml/sax/handler.py", line 38, in fatalError
    raise exception
xml.sax._exceptions.SAXParseException: <unknown>:12:63: not well-formed (invalid token)

Will it error again on this file? Would you like a copy of this file?

@mlaiosa
Owner

mlaiosa commented May 27, 2013

Could you show me the XML from the file with the error?

Mike


@pascalrobert

I got similar errors for some of my users. They happen because some invisible characters show up in the plist at the end of the .emlx file. I fixed that by closing Mail, opening the .emlx file in TextWrangler or vi, and removing the invisible characters (TextWrangler won't show them by default; you have to select View -> Text Display -> Show Invisibles).
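
For a large archive, the same cleanup could in principle be done in code instead of by hand. The sketch below is Python 3 and assumes the invisible characters are control characters that XML 1.0 forbids, which this thread does not confirm. `clean_plist` is a hypothetical helper, the sample data is invented, and `plistlib` is used here only to show that the cleaned XML parses.

```python
import plistlib
import re

# Control characters that XML 1.0 does not allow (tab, LF and CR are legal).
_ILLEGAL = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_plist(plist_xml: bytes) -> bytes:
    """Drop bytes that would make an XML parser report 'invalid token'."""
    return _ILLEGAL.sub(b"", plist_xml)


# Invented sample: a valid plist with a form feed injected into a string.
good = plistlib.dumps({"subject": "hello"})
bad = good.replace(b"hello", b"hel\x0clo")

try:
    plistlib.loads(bad)
except Exception as exc:
    print("rejected:", type(exc).__name__)

print(plistlib.loads(clean_plist(bad)))
```

A filter like this should be applied only to the trailing plist, not to the whole file, because removing bytes from the message part would no longer match the length stated on the first line.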
