#!/usr/bin/python
# -*- coding: iso-8859-1 -*-
'''
webGobbler 1.2.8
http://sebsauvage.net/python/webgobbler/
=== Purpose ====================================================================
Purpose:
This program creates pictures by assembling random images from the web.
Think of it as an attempt to capture the chaos of human activity,
which the internet is a partial and subjective snapshot of.
Motivation:
I recently discovered WebCollage (http://www.jwz.org/webcollage/)
and debris (http://www.badmofo.org/debris/).
- What's wrong with WebCollage : Not especially pretty, and written in perl.
I hate perl.
- What's wrong with debris : Sources not available. Only works under Windows.
Does not support proxies.
I created gossyp some time ago (http://sebsauvage.net/python/gossyp/).
I told myself I could do the same for images.
I also wanted to get better at multi-threaded programming.
I wanted to be able to feed those images in a desktop background changer,
a screensaver or whatever I want.
Authors:
Sebastien SAUVAGE, webmaster of http://sebsauvage.net
Kilian, webmaster of http://thesermon.free.fr/
=== Features ===================================================================
webGobbler:
* creates images by assembling random images.
* can get random images from the internet or from a directory of your choice.
* can apply various effects to images (rotation, inversion, mirror,
re-superposition, emboss...).
* can generate images of any size (Want to create a 10000x10000 image ?
No problem !).
* can output many file formats (JPEG, BMP, PNG, TGA, TIFF, PDF, PCX, PPM,
XBM...)
* can work as a simple image generator, a webpage generator,
a wallpaper changer, a screensaver...
* can run in command-line mode or GUI mode.
* runs under Windows (all flavors), Linux, MacOS X and any other OS
where Python and the PIL library are available.
* can save/load its configuration to/from the registry or a simple
configuration file in your home directory.
* supports proxies, with or without password.
* is open source !
* is free !
=== Disclaimer =================================================================
IMPORTANT - READ
This program downloads random images from the internet, which may include
pornography or any morally objectionable or illegal material.
Due to the random nature of this program, the author of webGobbler cannot be
held responsible for any URL this program has tried to reach, nor the images
downloaded, stored or displayed on the computer.
In consequence:
- this program may not be safe for kids.
- this program is definitely NSFW (not safe for work).
Use at your own risk !
You are warned.
You are advised this program may use copyrighted images.
Thus the images generated by webGobbler are only suitable for private use.
If you want to use it for non-private purposes, you may have to request permission
from the original image rights owners for each image composing the whole image.
(The URLs of the last pictures used to generate current image can be found in the
last_used_images.html file in the image pool directory.)
=== License ====================================================================
This program is distributed under the OSI-certified zlib/libpng license.
http://www.opensource.org/licenses/zlib-license.php
This software is provided 'as-is', without any express or implied warranty.
In no event will the authors be held liable for any damages arising from
the use of this software.
Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it freely,
subject to the following restrictions:
1. The origin of this software must not be misrepresented; you must not
claim that you wrote the original software. If you use this software
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not
be misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.
=== Requirements ===============================================================
* Python 2.3
* PIL (Python Imaging Library)
* Optional: For the Windows Wallpaper changer and screensaver: ctypes module.
* Optional: For the Gnome wallpaper changer: ctypes module.
* Optional: For the KDE wallpaper changer: python-dcop module.
* Optional: For the configuration GUI: Pmw (Python MegaWidgets)
(provided with webGobbler source)
* Optional: Psyco (to speedup webGobbler)
=== Platforms supported ========================================================
Any platform capable of running Python 2.3 and PIL.
For the screensaver: Windows 95/98/ME/NT/2000/XP/2003 or X-Windows
For the wallpaper changer: Windows 95/98/ME/NT/2000/XP/2003 or Linux with
Gnome or KDE.
webGobbler has been successfully run on Windows, Linux and MacOS X.
=== Technical details ==========================================================
There are 4 different kinds of objects in webGobbler:
* The collectors are in charge of spidering the web and downloading images.
They put the downloaded images in the pool.
* The image pool manages the local image collection and ensures a minimal
number of images. If the image pool is going low, it will ask the
collectors to get more images.
* The assemblers take images from the pool and assemble them in various
ways: simple image output as it is, mosaic of images, superposition of images...
* These assemblers can be used by different programs to produce images for
HTML page generation, screensavers, desktop background...
Each collector and the pool run in their own thread, so that the assemblers
and other objects can continue to work while the web is spidered.
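The threading model described above can be sketched roughly as follows. This is
only an illustration, not webGobbler's actual code: the names ToyPool,
ToyCollector and assemble, the low-water mark of 5 and the fake "images" (plain
strings) are all assumptions made for the example. As a simplification, the
collectors poll the pool here instead of being asked by it.

```python
import queue
import threading


class ToyPool:
    # Hypothetical stand-in for the image pool: a thread-safe store
    # that can tell when it is running low.
    def __init__(self, minimum):
        self.minimum = minimum
        self._images = queue.Queue()

    def needs_images(self):
        return self._images.qsize() < self.minimum

    def put(self, image):
        self._images.put(image)

    def get(self, timeout=5.0):
        # Blocks until a collector has delivered something.
        return self._images.get(timeout=timeout)


class ToyCollector(threading.Thread):
    # Hypothetical collector: "downloads" fake images while the pool is low.
    def __init__(self, name, pool, stop_event):
        super().__init__(name=name, daemon=True)
        self.pool = pool
        self.stop_event = stop_event
        self._counter = 0

    def run(self):
        while not self.stop_event.is_set():
            if self.pool.needs_images():
                self._counter += 1
                self.pool.put("%s-image-%d" % (self.name, self._counter))
            else:
                # Pool is full enough: idle briefly, but wake early on shutdown.
                self.stop_event.wait(0.01)


def assemble(pool, count):
    # Hypothetical assembler: takes `count` images from the pool.
    return [pool.get() for _ in range(count)]


stop = threading.Event()
pool = ToyPool(minimum=5)
collectors = [ToyCollector("collector%d" % i, pool, stop) for i in range(2)]
for collector in collectors:
    collector.start()

# The main thread assembles while the collectors refill the pool.
picture = assemble(pool, 8)

stop.set()
for collector in collectors:
    collector.join(timeout=1.0)
print(len(picture))
```

The assembler only ever talks to the pool, so adding another collector does not
change the assembling code.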
The design is modular.
For example, it's easy to write a new collector to spider a specific website.
It's also very easy to write new assemblers.
And assemblers are easy to use in programs.
Still there is room for improvement (and refactoring...).
Currently, existing modules are:
* collector_deviantart: This collector gets random images from
http://deviantART.com, an excellent collaborative art website.
Anyone can post their creations, and visitors can comment. Site contains
photography, drawings, paintings, computer-generated images, etc.
* collector_randomimagesus: http://randomimage.us shows a random,
user-submitted picture on homepage. (This collector is currently
deactivated.)
* collector_askjeevesimages uses the Ask Jeeves Image search engine
(http://pictures.ask.com) by querying with randomly created words
(I will later use a real word list). This search engine even has a "bad"
image filter which should filter most pr0n away.
* collector_yahooimagesearch: This is also an image search engine
(http://search.yahoo.com/images), but with a different database than AskJeeves.
* collector_googleimages uses the Google Image search engine
(http://images.google.com)
* collector_flickr uses random images from the famous Flickr.com website
(http://flickr.com)
* collector_local: If you do not have an internet connection, or have a slow one,
or do not want to eat bandwidth, this collector can scan the local hard disk
to find images (use the --localonly command-line option to enable it).
Surprisingly enough, this gives not-so-bad results.
* assembler_simple simply outputs a single image, resized to the desired
dimensions (with antialiasing).
* assembler_mosaic creates a mosaic of images (a grid of images). You can
change desired final resolution and the number of images to put in the
mosaic.
* assembler_superpose is currently the most complex one: It superposes
the images with transparency and does some miscellaneous stuff
(compensate for poorly contrasted images, resize images larger than
screen, try to detect "too white" pictures and invert them,
rotate images, paste them with transparency, etc.).
Applications are:
* image_saver uses the assembler_superpose and saves the image
as a simple BMP file every 60 seconds (configurable). This image_saver
is available through the command-line or through a GUI.
* htmlPageGenerator generates an auto-refresh HTML page and an image.
* windowsWallpaperChanger changes the desktop wallpaper under Windows.
* windowsScreensaver is a Windows screensaver.
* gnomeWallpaperChanger changes the desktop wallpaper under Gnome (Linux).
* kdeWallpaperChanger changes the desktop wallpaper under KDE (Linux).
* x11Screensaver is a screensaver for X-Windows.
* There are also other uses (Gnome & KDE wallpaper changer, etc.)
Program source code is full of "FIXME" comments: There is a lot of work
remaining.
=== Examples ===================================================================
Command-line examples:
* python webgobbler.py --tofile webgobbler.bmp
Generate a new image every 60 seconds in 1024x768 (You will have to
wait a few minutes until it gives interesting results.)
* python webgobbler.py --tofile image.png --resolution 640x480 --every 30
Generate a new image at 640x480 every 30 seconds.
* python webgobbler.py --towindowswallpaper --norotation --emboss
Generate a new wallpaper every 60 seconds. Disable rotation and emboss
the generated image.
No use specifying a resolution: the wallpaper changer will automatically
pick up the screen resolution.
* python webgobbler.py --towindowswallpaper --proxy netcache.myfirm.com:3128
--proxyauth "John Smith:booz99"
Generate Windows wallpaper, and connect to the internet through the proxy
netcache.myfirm.com on port 3128 with the login "John Smith" and the
password "booz99".
* python webgobbler.py --every 120 --invert --saveconfreg
Saves the options in Windows registry for later use with --loadconfreg
or /s (Windows screensaver)
* python webgobbler.py --loadconfreg
Run webGobbler using options saved in the registry.
* python webgobbler.py /c
Call the webGobbler configuration screen. You can tweak all the options
and click the "Save" button.
These options will be used by the screensaver (see /s below) or the
--loadconfreg/--loadconffile.
* python webgobbler.py /s
Call webGobbler as a Windows screensaver. Options will be read from the
registry. (The DOS Window will still appear.)
To create the registry setting with default values, run:
python webgobbler.py --saveconfreg
(Note that if you use the Windows binary, replace "python webgobbler.py" with
"webgobbler_cli.exe" or "webgobbler.exe".)
=== Ideas, Todo, notepad, other stuff... =======================================
IDEA IDEA:
Record all actions (image URL, rotation angle, paste coordinates, etc.)
in order to be able to save in a file and replay it in order
to reconstruct the image !!!
:-)
In webgobbler_app:
See if image can be scrolled by dragging it (Like a 'hand' tool).
In webgobbler_app:
offer the possibility to select which collectors to use in the GUI ?
webgobbler_app:
See how to display a tray icon.
See how to change the program icon in taskbar.
See how to mask program in taskbar.
See how to set an icon on the cxFreeze exe.
(ctypes/api Win32 ?)
See:
http://groups-beta.google.com/group/comp.lang.python/browse_thread/thread/d3fd71b3c2424746/5de432c60503c608?q=tkinter+set+icon&rnum=3&hl=en#5de432c60503c608
In webgobbler_app:
Save application preferences in the applicationConfig object.
(set as wallpaper, minimise to tray, show activity...)
FIXME: implement /p /a Windows-screensaver-specific command-line options.
Idea: Why not create a tray icon to start/stop/configure the wallpaper changer ?
Idea: Use the transparency mask of the picture to superpose, and darken image
with this mask a few pixels down and left. This may give a nice "shadow"
effect on each pasted picture. --> to experiment.
How to write a screensaver for Windows:
http://www.christiancoders.com/cgi-bin/articles/show_article.pl?f=briant05292003004836.html
Improve assembler_superpose: non-square transparency,
Fractint-like plasma transparency... ?
Change contrast, brightness, run through external program...
Collectors should implement a delay-between-each-request attribute,
or any other mechanism to be gentle with bandwidth.
--> How could I implement a bandwidth limitation shared by all collector threads ?
By centralizing downloads ?
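One possible answer to the question above is a token bucket shared by every
collector thread. The sketch below is an assumption, not something webGobbler
implements: the class name SharedBandwidthLimiter, its parameters and the
figures in the usage lines are all hypothetical. Each thread would call
consume() before reading a chunk from the network.

```python
import threading
import time


class SharedBandwidthLimiter:
    # Hypothetical token bucket: `rate` bytes per second, shared by all threads.
    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._tokens = float(capacity)
        self._clock = clock
        self._sleep = sleep
        self._last = clock()
        self._lock = threading.Lock()

    def consume(self, nbytes):
        # Block until `nbytes` may be downloaded without exceeding the rate.
        if nbytes > self.capacity:
            raise ValueError("chunk larger than bucket capacity")
        while True:
            with self._lock:
                now = self._clock()
                elapsed = now - self._last
                self._last = now
                # Refill according to the time elapsed, up to the capacity.
                self._tokens = min(self.capacity,
                                   self._tokens + elapsed * self.rate)
                if self._tokens >= nbytes:
                    self._tokens -= nbytes
                    return
                missing = nbytes - self._tokens
            # Sleep outside the lock so other threads are not blocked.
            self._sleep(missing / self.rate)


# One limiter object would be handed to every collector thread.
limiter = SharedBandwidthLimiter(rate=50000, capacity=50000)
limiter.consume(4096)
```

Because the bucket is a single object protected by a lock, the limit applies to
the sum of all downloads without having to centralize the downloads themselves.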
Utility methods to develop (for all the assembler modules and/or collectors)
- better white image detection
- banners/spacers/etc. detection (according to URL (see AdBlock), file SHA1,
image dimensions (see Proxomitron), other ?)
- pr0n image detector ? (using flesh-tones detection ?)
Random links:
See http://dir.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Search_Engines_and_Directories/Random_Links/
Image search engines:
http://directory.google.com/Top/Computers/Internet/Searching/Search_Engines/Specialized/Images/
http://dmoz.org/Computers/Internet/Searching/Search_Engines/Specialized/Images/
Add imageshack.us, others ?
Collector to add:
http://www.getty.edu/art/ (Art gallery)
Find other sources (the more the better)
Maybe useful later:
http://wwwsearch.sourceforge.net/ClientCookie/
This handles cookies automatically.
---------- Building instructions for the Windows application and screensaver -------------------
This involves some manual work.
STEP 1:
Get cxFreeze for your version of Python.
http://starship.python.net/crew/atuining/cx_Freeze/
STEP 2:
Bundle Pmw.py using bundlepmw.py (provided with Pmw).
Copy Pmw.py (roughly 300 kb), PmwBlt.py and PmwColor.py in the directory
of webgobbler sources.
STEP 3:
Run cxFreeze to "compile" the program:
FreezePython.exe --install-dir dist_freeze --target-name=webgobbler_cli.exe webgobbler.py
FreezePython.exe --install-dir dist_freeze --target-name=webgobbler.exe --base-binary=Win32GUI.exe webgobbler.py
STEP 4:
Copy these two DLLs into the dist_freeze directory:
copy C:\Python23\DLLs\tcl84.dll .\dist_freeze\
copy C:\Python23\DLLs\tk84.dll .\dist_freeze\
STEP 5:
Copy the whole directory C:\Python24\tcl\tcl8.4 to dist_freeze\libtcltk84\tcl8.4
Copy the whole directory C:\Python24\tcl\tk8.4 to dist_freeze\libtcltk84\tk8.4
(with subdirectories)
STEP 6:
Remove extraneous tcl/tk scripts (demos, http, etc.)
At this point, you have a full-fledged working webGobbler program.
STEP 7:
Using AutoIt v3, compile the following script webGobbler.au3 into webGobbler.exe:
-----SCRIPT STARTS HERE--------------------------------------------------------
; webGobbler caveat: Usage of tcl/tk library in webGobbler imposes that
; when webgobbler.exe is run, the current directory is the same as
; webgobbler.exe.
; Therefore I should put all the webGobbler files (including tcl/tk lib)
; in the c:\windows\system32 directory, along with webgobbler.scr.
; This is not good practice.
;
; This stub (placed in the Windows system folder) runs the real webGobbler
; program in its own directory with the right options.
; Hide the tray icon.
Opt("TrayIconHide", 1)
; Read webGobbler installation path from registry.
$regpath = "HKEY_CURRENT_USER\Software\sebsauvage.net\webGobbler"
$regname = "installation_directory"
$installdir = RegRead($regpath,$regname)
if $installdir = "" Then
; If not found in HKCU, Try to read from HKEY_LOCAL_MACHINE instead:
$regpath = "HKEY_LOCAL_MACHINE\Software\sebsauvage.net\webGobbler"
$installdir = RegRead($regpath,$regname)
EndIf
If $installdir = "" Then
MsgBox(16,"webGobbler screensaver","webGobbler installation path could not be found in registry."&@CRLF&"Please reinstall webGobbler."&@CRLF&@CRLF&"(Key "&$regname&" in "&$regpath&")")
Exit(1)
EndIf
; Make sure installation path ends with a backslash (\)
if StringRight($installdir,1) <> "\" Then $installdir = $installdir & "\"
; Make sure webGobbler is installed in this directory.
If NOT FileExists($installdir & "webgobbler.exe") Then
MsgBox(16,"webGobbler screensaver","webGobbler.exe could not be found in directory " & $installdir & @CRLF & "Please reinstall webGobbler.")
Exit(1)
EndIf
; If no command-line parameter is provided, exit.
If $CmdLine[0] = 0 Then Exit
; Get the command-line option
$opt = StringLower($CmdLine[1])
; Call webGobbler
; PS: Looks like windows sometimes call /c with a handle ("/c:651484").
; What's the purpose of that ???
If StringLeft($opt,2) == "/s" Then RunWait($installdir & "webgobbler.exe /s", $installdir)
If StringLeft($opt,2) == "/p" Then Exit(0) ; Preview mode - FIXME: To implement
If StringLeft($opt,2) == "/l" Then Exit(0) ; Preview mode - FIXME: To implement
If StringLeft($opt,2) == "/a" Then Exit(0) ; Change password (Win95/98 only) - FIXME: To implement
; In all other cases, display the configuration screen.
; (For example, right-clicking the .scr and choosing "Configure" will call with no command-line option.)
RunWait($installdir & "webgobbler.exe /c", $installdir)
; FIXME: implement /p with handle.
-----SCRIPT ENDS HERE----------------------------------------------------------
Once compiled into an .exe with AutoIt, rename it to webGobbler.scr
CAVEAT: the AutoIt stub has to be patched to support the /s option
(because the default AutoIt stub hooks the /s option.)
webGobbler.scr needs to know where webGobbler.exe is installed.
It reads the registry (key installation_directory
in HKEY_CURRENT_USER\Software\sebsauvage.net\webGobbler)
=== FAQ ========================================================================
* Why is it called webGobbler ?
Because it gobbles anything it finds on the web.
(Well, I should have named this something like "Chaos Tapestry" or
"The beautiful trashbin" or "Shreddage". Whatever. Too late.)
* Why choose Python ?
Efficiency, readability, portability, large standard libraries, coolness.
* Why don't you use [AltaVista image search][FastPath image search]
[Insert-your-image-search-engine-name-here] ?
Because most of these search engines have the same database as Yahoo and
Jeeves. Give it a try: search the same word in all those engines: you will
find the same pictures in the same order.
* What's the largest image size webGobbler can generate ?
I don't know, but this should be fairly large (depending on how much memory
your computer has). It's bound to the PIL library. I managed to create
a 10000x10000 image with no problem. It just ate an awful lot of memory.
* How much memory does webGobbler use ?
It depends mainly on the size of the image to generate. The larger the final
picture, the more memory used. The GUI version uses more memory, of course.
Hint: If you want to create large images, use the command-line version.
* How much CPU does webGobbler use ?
When only spidering the web, almost nothing (usually below the 1% threshold).
When assembling images, slightly more, but for a short period.
If you want webGobbler to never slow you down, don't forget you can change
its process priority so that it will *never* slow other processes.
Under Windows NT/2000/XP, bring up the task list (CTRL+SHIFT+ESC), right-click
on Python.exe or webGobbler.exe, "Set priority" > "Low".
Under *nixes, use nice to set the priority to 19.
Anyway, webGobbler is usually nice on the CPU.
* What image formats are supported by webGobbler ?
webGobbler will only download the following image types from the internet:
jpeg, gif, png, tiff and bmp.
webGobbler could be easily extended to support any format supported by the
PIL library. (For the list of supported formats,
see http://www.pythonware.com/library/pil/handbook/formats.htm)
In output, webGobbler can write all formats supported by PIL: As of 2004-09-16,
PIL can write: PNG, BMP, JPEG, GIF, PDF, TIFF, PCX, PPM, XBM, EPS, IM and MSP.
To choose the output format, you just need to use the desired extension
in command line (eg. --tofile mypicture.tiff)
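One way to enforce the download restriction described above is to look at the
Content-Type header before saving anything. The sketch below is an assumption:
the history notes mention an ACCEPTED_MIME_TYPES list in the code, but its real
content and shape are not shown here, so the dictionary and the helper name
extension_for are hypothetical.

```python
# Hypothetical mapping; the real ACCEPTED_MIME_TYPES may differ.
ACCEPTED_MIME_TYPES = {
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "image/png": ".png",
    "image/tiff": ".tif",
    "image/bmp": ".bmp",
}


def extension_for(content_type):
    # Drop parameters such as "; charset=..." and normalise the case.
    mime = content_type.split(";", 1)[0].strip().lower()
    # None means "do not download this".
    return ACCEPTED_MIME_TYPES.get(mime)


print(extension_for("image/png"))
print(extension_for("text/html"))
```

Deriving the extension from the MIME type rather than from the URL also gives
a usable file name when the image is served by a CGI script.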
* Will there be porn in images generated by webGobbler ?
It may.
I haven't developed anything to block porn.
Flickr may churn out some porn and DeviantArt.com also has some nudes (rare).
Other collectors are not likely to output porn, because the default behaviour
of search engines is to block porn. If you want to reduce the risk of seeing
porn, deactivate (in code) the following two collectors: collector_flickr,
collector_deviantart. But there is no guarantee ! The disclaimer
of webGobbler is still relevant.
* What's this imagepool directory ?
webGobbler stores in this directory the images it has downloaded from the internet.
Once in a while, it picks an image from this directory in order to mix it and removes it
from the imagepool directory.
webGobbler will try to keep a constant number of images in this directory,
so it will not grow out of control.
* Why do the files in the image pool have those strange long names ?
webGobbler ignores the name of the original image on the internet. The name
is derived from the content of the image itself. This gets rid of duplicates
(two identical images with different names). This also ensures two different
images with the same name will not clash. (This is much like what most P2P
programs do to identify files whatever their name.)
If you open an image from the pool with a hex editor, you will see the
original image URL and file name at the end of file
("--- Picture taken from..."). If you download the image and compute its SHA1
(with sha1sum for example), you should find the same SHA1 as in the filename
(WG*.*)
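The naming scheme described above can be sketched as follows. The function name
pool_filename and the exact layout of the name (the "WG" prefix followed
directly by the hex digest) are assumptions; only the use of SHA1 over the
image content and the WG*.* pattern come from the text. The digest is computed
on the downloaded bytes, before the "--- Picture taken from..." trailer is
appended.

```python
import hashlib


def pool_filename(image_bytes, extension):
    # Name the file after its content, not after its original URL.
    digest = hashlib.sha1(image_bytes).hexdigest()
    # Hypothetical layout: prefix + 40 hex characters + extension.
    return "WG" + digest + extension


first = pool_filename(b"same pixels", ".jpg")
second = pool_filename(b"same pixels", ".jpg")
other = pool_filename(b"different pixels", ".jpg")
print(first == second, first == other)
```

Identical content always maps to the same file name, so a duplicate simply
overwrites itself, while different content gets a different name whatever the
original file was called.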
* How can I know which images were used to compose the image ?
Look into the image pool directory (./imagepool): There is a file named
last_used_images.html. It contains the URL of the latest images used
to create the current image. Most recent images are at the bottom of the list.
This file will be kept to a maximum of 1 Mb.
* How can I participate ?
What I need most now is a webGobbler logo. Ideas or images are welcome.
I prefer 2D vector work more than 3D C.G. If your work is integrated into
webGobbler, your name will of course appear in the credits. Don't forget
this work will go under the zlib/libpng license. Right now, I do not seek
direct contributions to code. If you have ideas (about image collection,
assembling or any other feature), I would be most pleased to hear about them !
* Why not put webGobbler on SourceForge.net ?
I have no time to administrate such a thing (CVS, bug tracking, etc.).
This project is too small to benefit from this.
* What is this 'psyco' thing ?
Psyco accelerates Python programs on x86-compatible processors (Pentium).
Acceleration ranges from x2 to x100 without a single modification in code.
If psyco is installed, this program will automatically use it to run faster.
Don't worry if you don't have psyco: webGobbler will still be fully
functional and will run as usual.
* WebGobbler does not create a nice collage of my photos !
WebGobbler is *NOT INTENDED* to create a nice collage of your photos.
It's designed to be a random modern-art generation program.
The "local directory" spider is here only for convenience.
* Thief ! You steal images.
No. I do not steal images. webGobbler does not steal more images than your
average browser either: They both download images and display them on the
computer screen.
Respecting the work of others and their copyrights is *YOUR* responsibility,
not mine, webGobbler's or your browser's.
If you are creating art based on work of others, YOU'RE the
person responsible, whatever tool you use (webGobbler, The Gimp or any other).
* I want to be able to click on the image and be redirected to the original image.
Not in the near future, I fear.
(This is tricky, because a single pixel on the image is the result of the
superposition of dozens of different images. This feature would not be relevant.)
* I want to take images only from a single website.
It's not possible with the current version of webGobbler, and this feature
is not planned in the near future.
As a workaround, you can download the website with tools like HTTrack, then
ask webGobbler to use only images from this directory.
=== History ====================================================================
1.0 beta 3 (2004-xx-xx):
- First public release. zlib/libpng license.
- Code was somewhat cleaned (Lots of work remaining)
- I chose a global config (CONFIG) instead of passing parameters to each constructor.
- Detailed command-line help is now displayed.
- Added more documentation (license, FAQ, etc.)
- Currently, only the image generator (--tofile) and Windows wallpaper changer
(--towindowswallpaper) are implemented and active.
- implemented persistence for assembler_superpose. Still need to add
persistent directory in command-line.
1.0 beta 4 (2004-09-10): (not released)
- Yahoo Image search "not found" message has changed.
- deviantArt.com link to full view image has changed.
- In order to be more portable, I changed collector_local default
start directory from "C:\" to "/" ("/" is also accepted under Windows)
1.0 beta 5 (2004-09-13):
- Changes in assembler_superpose: New mode which does not darken image
but uses Equalize operation to even out channel values.
This gives overall better pictures:
- less dark areas
- less grey areas
- more saturated colors, even if all source images are not very
saturated.
- more contrast
- better image mixing
- less rectangular visible edges.
- much more details
- some details can last longer in the final image and shift colors.
That's closer to what I intended to do. A more chaotic picture.
Thinking of it, I should have named this program "Chaos Tapestry".
This program is an attempt to capture the chaos of the human activity,
which the internet is a partial and subjective snapshot of.
This mode is now the default mode for assembler_superpose.
The old mode (beta 4 and previous) is available through the
new "--variante 1" command-line option.
- I chose to de-activate Psyco by default. You will have to uncomment
psyco code to use it (My old Pentium 200 with 64 Mb of RAM does not
seem to appreciate psyco on heavy load).
- randomimages.us collector deactivated because it often gives the same
images. You will have to uncomment it to re-enable it.
This seems to give better overall pictures.
- new Emboss filter (--emboss).
1.0 beta 6 (2004-11-01):
- When search engines are overloaded, the delay has been extended from 10
to 60 seconds to be more gentle with them.
- last_used_images.txt is now last_used_images.html so that it's easier
to view remote images without the hassle of copy-pasting URLs
(thanks to Kilian for suggesting this.)
- webGobbler image branding (lower right corner) font size is a bit larger.
I still need to find a logo for webGobbler (maybe a 2D vector gobbler
with a rainbow comb&tail and a vacuum cleaner in hand ? ;-)
- Added new answers in the FAQ.
- "--norotation" argument added. This disables image rotation.
- "--proxy" argument added. Now you can properly configure proxy from
command-line (without having to touch the code.)
- Also supports Basic proxy authentication (for proxies which require
a login/password): "--proxyauth" argument has been added.
You do not HAVE to provide password in command-line.
If the password is not provided, you will be prompted to enter it.
Example (with password) : --proxyauth "foo bar:mysecretpassword"
Example (without password): --proxyauth "foo bar"
- When downloading an image, its MIME type (Content-Type) is now checked
against a fixed list of known MIME types. This prevents the download
of exotic image formats which would not be supported by PIL.
(See ACCEPTED_MIME_TYPES in code.)
Furthermore, the correct extension will be added to the image file
according to the MIME type so that images in the imagepool will be saved
correctly even if the original URL does not have the correct
extension (such as images provided by CGI).
- Slightly reduced message verbosity so that it fits a bit nicer on screen
when using --debug.
- Better deviantArt.com particularities handling (poetry pages, etc.)
This will slightly reduce the number of outgoing requests to this site.
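The --proxyauth behaviour described in the list above (login and password
separated by a colon, with a prompt when the password is omitted) could be
handled along these lines. This is a sketch under assumptions, not the actual
parsing code: the function name parse_proxy_auth, the ask_password parameter
and the prompt text are hypothetical.

```python
import getpass


def parse_proxy_auth(value, ask_password=getpass.getpass):
    # Split on the first colon only, so that passwords may contain colons.
    login, separator, password = value.partition(":")
    if not separator:
        # No password on the command line: ask for it interactively.
        password = ask_password("Proxy password for %s: " % login)
    return login, password


print(parse_proxy_auth("foo bar:mysecretpassword"))
```

Prompting with getpass keeps the password out of the shell history and of the
process list, which is the main reason to leave it off the command line.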
1.0 beta 7 (2005-01-14):
- corrected a bug in collector_local which would reset its directory pool
a bit too early in some situations.
- collector_deviantart changed to adapt to deviantart.com website changes.
- For the sake of the Netiquette, webGobbler now properly sends
its User-Agent "webGobbler/1.0b7" in HTTP headers instead of
the standard "Python/urllib".
(But for the sake of the Netiquette, should I respect robots rules ?
webGobbler does not technically 'spider' websites.)
- socket timeout set to 15 seconds for the whole program so that the
collector threads are not stuck trying to download an image from a site
which does not respond (or does not respond in a reasonable time).
- In all collectors, sleep() was replaced by self.waituntil so that
collector will react more quickly on shutdown commands.
FIXME: I still need to take care of some possible time-warping risks
using time.time().
- revamped all collectors so that their _getRandomImage() method
returns more quickly while not slowing down the spidering process.
This way, the threads will die more quickly when requested to shutdown.
This is better for the screensaver.
- Windows Wallpaper changer now automatically uses the current screen
resolution. Command-line specified resolution will be ignored.
- I wrote the core of the Windows Screensaver, using ctypes only.
(Phew ! Win32 API programming sucks.)
(No dependency on Mark Hammond's win32 module, nor pyGame, nor Tkinter,
nor pyScr...). Resulting binaries will be smaller.
With py2exe+UPX, I managed to have the whole webGobbler below 1.3 MB.
Right now, only the /s (start) option of Windows screensaver is
implemented. I still have to implement /p /c and /a.
You'll have to configure the screensaver by the command-line
with --saveconfreg (see below).
- The screensaver has a separate thread to handle its window, so that
it will immediately turn off if mouse is moved, even if the other threads
are still downloading or crunching data.
Better responsiveness, happier user.
- I chose to put the screensaver in a separate file (wgwin32screensaver.py).
This may change later.
- Implemented the applicationConfig class which will ease the
storage/retrieval of program configuration.
It can import/export to/from XML, file or Windows registry.
The screensaver (/s) automatically uses Windows registry configuration.
- Added command-line options saveconfreg/loadconfreg/saveconffile/loadconffile
to save/load options to/from registry/file.
Usage example: specify all your options in command-line and use --saveconfreg
Then you will just have to call webGobbler with --loadconfreg to recall
all the options.
The screensaver will automatically use options saved with --saveconfreg
- I stumbled upon this: http://www.scroogle.org/gscrape.html
The author is quite right.
After all, Google makes multi-million dollar profits by indexing
and using *our* sites. I don't earn a single penny out of Google,
so why should I feel guilty of using Google in return ?
So I decided to include a collector for Google Image Search.
webGobbler being only for private, non-commercial use, I invoke
the "fair use" right.
- Better error handling (that's why the code is so verbose).
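A minimal sketch of the two network settings mentioned in this release (the
global 15-second socket timeout and the custom User-Agent). It uses the
Python 3 standard library names, which is an assumption (the code of that
time targets Python 2), and build_request() is a hypothetical helper.

```python
import socket
import urllib.request

# Applies to every socket created afterwards, in all threads, so that a
# collector never waits forever on a site which does not respond.
socket.setdefaulttimeout(15)

USER_AGENT = "webGobbler/1.0b7"


def build_request(url):
    """Return a request which announces webGobbler instead of the default agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

The request object would then be passed to urllib.request.urlopen() by the
collector doing the download.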
1.0 beta 8 (2005-01-16):
- changes for the deviantart.com website.
1.0 beta 9 (2005-01-19):
- Added the --singleimage command to generate a single image and exit.
- Added the --tohtml command which generates an auto-refreshing HTML
page and its corresponding JPEG image.
Simply open the html page in your browser and the image will
automatically refresh.
You can also generate directly in the directory of your webserver
(--tohtml "c:\wwwroot\fun\webgobbler_current.html")
- Branding was redesigned with a new font and the small eye logo
of my website.
(The font is '04B-11' from http://www.dsg4.com/04/extra/bitmap/)
- When starting a new image, a message is displayed:
"Please wait while the first images are beeing downloaded..."
(just to acknowledge that webGobbler is up and running, because
the first image can appear 60 seconds after starting.)
- Changed the name of some configuration parameters.
- Switched from XML to plain .INI file (It's easier for the users to edit.)
Types are checked against default values on reading.
- In consequence: saveToFileInUserHomedir() and loadFromFileInUserHomedir()
now save .INI-structured files instead of XML.
- saveToRegistryCurrentUser() and loadFromRegistryCurrentUser()
now use a different value in registry for each parameter
(It's easier to edit with RegEdit, and will ease the creation of the
screensaver configuration GUI - probably in Delphi.)
Note that all parameters are saved as text (REG_SZ) in registry.
Types are checked on reading.
- Proxy password is now garbled when saved to INI/file/registry.
IT IS NOT ENCRYPTED and can still be recovered.
But at least it's not stored in plaintext.
- Image download now immediately aborts if the image size announced
in HTTP response headers is too big (in class internetImage).
- internetImage object now returns the textual reason why the image was
discarded (self.discardReason): URL blacklisted, not an image, image
too big, etc. This is displayed when --debug is used.
- URL blacklisting was implemented (see BLACKLIST_URL).
(URL filters use an AdBlock-like syntax.)
- blacklist.imagesha1 and blacklist.url are now exported/imported
in .INI files/registry so that they can be user-customized
(instead of hard-coded).
(They still cannot be customized through command-line:
the configuration (file or registry) needs to be manually updated.)
Values are separated by |. % must all be escaped to %%.
Example: http://*.doubleclick.net|*/adserver/|http://*.xiti.com
- Global application logging mechanism is in place.
(Woao... logging module is really *great* !)
- Cleaner shutdown (I enforced threads shutdown order)
- Prevented simultaneous calls to superpose() in each assembler_superpose
instance in order to prevent CPU and image pool waste.
(self.currentlyassembling attribute, but not read/changed in critical
section because it's not worth it.)
- --debug mode will now also write log to a file (webGobbler.log)
Now I can catch almost any unexpected exception and log it to this file
(even in the console-less (screensaver) version of webGobbler)
- psyco was re-enabled (gives a good performance boost, especially for the
screensaver).
- psyco warning is now caught and silenced.
- code was adapted to run both in console and console-less mode
(win32gui.exe in cx_Freeze).
It's now possible to 'compile' webGobbler with cx_Freeze and get rid
of the DOS window (It's better for the screensaver).
You can still use --debug in the console-less version to see
what's going on (in the webGobbler.log file).
- Side effect: You can run the console-less version with:
wg.exe --towindowswallpaper
to have a background process which will change your background.
The process is nice enough to not fail if the internet connection drops.
It will resume downloading and generating images when the internet
connection is available again.
You can put this executable in your startup menu.
(But to stop it, you will have to kill the process.)
- Still no binary this time: There's work remaining (screensaver
configuration GUI, installer, etc.)
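A minimal sketch of how a blacklist.url value could be read back from an
.INI file and applied, written for Python 3. The section name [webgobbler],
the function names and the last filter (added only to show the %% escape)
are hypothetical, and the matching rule ("*" matches anything, a filter
matches anywhere in the URL) is an assumption about the AdBlock-like syntax,
not the actual implementation.

```python
import configparser
import re

# Hypothetical section name; the key and the first three filters come from
# the changelog entry. The last filter only illustrates the %% escape.
INI_TEXT = """
[webgobbler]
blacklist.url = http://*.doubleclick.net|*/adserver/|http://*.xiti.com|*100%%free*
"""


def load_url_blacklist(ini_text):
    """Read blacklist.url and return one compiled regular expression per filter."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # ConfigParser interpolation turns "%%" back into a literal "%".
    raw = parser.get("webgobbler", "blacklist.url")
    filters = [item for item in raw.split("|") if item]
    # Assumption: "*" matches any run of characters, the rest is literal.
    return [
        re.compile(".*".join(re.escape(part) for part in item.split("*")))
        for item in filters
    ]


def is_blacklisted(url, patterns):
    """Assumption: a filter matches when it is found anywhere in the URL."""
    return any(pattern.search(url) for pattern in patterns)
```

The %% rule in the changelog comes from the way ConfigParser treats a single
% as the start of an interpolation; reading the value gives back the plain %.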
1.0 beta 10 (2005-03-20):
- Added the variante 2 (--variante 2) which mirrors and re-superposes the
final image. It creates a quasi-symmetry in the image.
- Small bug corrected (session saving).
1.0 beta 11 (2005-06-29):
- option --variante 2 changed to --resuperpose
- spurious exceptions trapped on some PIL calls.
- oops... in beta 10, I forgot to update the version number in User-Agent.
- collector_deviantart changed to accommodate DeviantArt.com changes.
1.0 beta 12 (2005-06-30):
- I got a *lot* of exceptions with the new version of PIL (Hence all the
new try/except).
- collector_askjeevesimages changed to adapt to website changes.
- AT LAST ! A configuration GUI developed. You can now access webGobbler
configuration with the /c or --guiconfig option.
The GUI is developed in Tkinter, which makes it portable. Configuration
GUI will automatically pick up registry or .ini file according to what's
available.
Though... It's not completed yet. (For example, the help does not display
help, and I still need to tighten up the widgets (alignment, resizing,
data controls, etc.).)
I think I will also put some icons to illustrate the different options.
- Therefore: the /c option for the Windows screensaver is now working !
- Still no binary this time (I have to implement help and also get rid
of some packaging & path issues (Is pyco dead ???)).
1.0 beta 13 (2005-07-02):
- Corrections for Python 2.4.1 (You were getting an exception in the
GUI on the "Save" button).
- Tested successfully with Python 2.4.1, PIL 1.1.5, ctypes 0.9.6,
Pmw 1.2 and cxFreeze 3.0.1.
- assembler_superpose() now closes more quickly when asked to shutdown()
(Previously, it used to finish processing its 10 images before dying.)
It's much better for the screensaver.
- When resolution is changed, the previous image is not trashed anymore:
It's resized. This way, the user will not needlessly lose
previously used CPU cycles and bandwidth.
- Debug option added to configuration GUI.
- Added some icons in the GUI.
- Removed the Help area from the GUI.
1.0 beta 14 (2005-07-02):
- collector_askjeevesimages changed again to adapt to website changes.
1.0 beta 15 (2005-07-05):
- AT LAST, a working binary for Windows. No command-line hassles. Rejoice !
- webgobbler_config now derives from Tkinter.Toplevel so that it can be
used as a dialog window in an application.
- corrected a bug in --guiconfig which would not display the window (!)
- assembler_superpose refactored: It does not derive from abstract class
assembler anymore. The new assembler_superpose is more efficient.
(Most methods of this class are now non-blocking.)
- Method assembler_superpose.saveSessionImage() was removed: this assembler
now always saves session state once it has completed assembling images.
- The default behaviour of webgobbler.py when no command-line options
are specified is to run in GUI mode.
All command-line options are still available.
The command-line help is available through the new --help option.
- As the application is in a separate thread from image downloading
and crunching, it should be fairly responsive.
(Well... maybe except when shutting down due to network timeouts.)
- CONFIG is no longer global: it is passed in each constructor.
(It was a mess, really.)
1.0 beta 16 (2005-07-06):
- Oops... I completely fucked up the distribution of beta 15 because
I picked up the wrong directory. Sorry for that.
- ask.com changed again: collector_askjeevesimages was changed accordingly.
- corrected a bug in GUI which would trigger the Tkinter timer twice.
- corrected a bug when using a proxy with password with the GUI
(didn't work in beta 15).
1.0 beta 17 (2005-07-08):
- In beta 16 binary, I had to include the whole tcl/tk library :-/
In beta 17, it's still the case, but I removed some useless
parts (demo, http...)
- Added a handler for the "pr0n" warning message of Yahoo.
- I used AutoIt to create the .scr which runs the main webGobbler.exe
- Windows binary is now nicely packaged into an installer using InnoSetup
(Great program, really ! And very easy to use.)
- To help the Windows installer detect running instances of webGobbler,
webGobbler creates a mutex under Windows.
- The screensaver seems to be a bit too sensitive to Windows events
and it turns off too easily. I saw this behaviour under Windows 2000
but could not reproduce it under Windows XP. I will have to
investigate this.
1.0 beta 18 (2005-07-14):
- In GUI, added a confirmation box for "Start new image from scratch"
- Added a status in the main Window (to see collectors activity).
- Now you do not have to restart the application if you change proxy settings.
- The initial message "Please wait while the first images are beeing downloaded..."
is now removed as soon as the first images are superposed.
1.0 beta 19 (2005-07-18):
- Requested feature: In GUI, added ability to choose start directory
when the local disk only is used to get images.
- The local disk collector status is now also displayed.
- Corrected the status display ('Off') when a collector is deactivated
(Previously status was not updated after collector deactivation).
1.0 beta 20 (2005-07-19):
- Improved the random word generator.
- Removed the forced GUI update (was useless).
- In the GUI, added the status of the assembler (Now you can see when
it's working).
- In the GUI, the Tkinter timer (.after()) which triggers image assembling
was replaced by an internal variable.
- In the GUI, timer is now re-armed with proper delay when user selects
"Update current image now".
- I got rid of a major plague of webGobbler: Sharp edges !
webGobbler now smoothens borders before superposing images
(see the _darkenImageBorder() method). This gives much better results.
Border smoothing is enabled by default (with a 30 pixel border).
If you want to disable border smoothing, set border size to 0.
Border smoothing parameter is available from command-line (--bordersmooth)
and from the GUI.
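A rough sketch of the idea behind border smoothing, in plain Python 3 and
without PIL. This is not the code of _darkenImageBorder(): border_factor()
is a hypothetical helper, and the linear ramp is an assumption chosen only
to illustrate how pixels could be darkened near the edges.

```python
def border_factor(x, y, width, height, border=30):
    """Return a brightness multiplier between 0.0 (edge) and 1.0 (inside)."""
    # A border size of 0 disables smoothing: pixels are left untouched.
    if border <= 0:
        return 1.0
    # Distance in pixels to the nearest edge of the image.
    distance = min(x, y, width - 1 - x, height - 1 - y)
    # Assumed linear ramp: fully dark on the edge, untouched after `border` pixels.
    return min(1.0, distance / border)
```

Multiplying each pixel by this factor before superposing makes the image fade
out towards its borders, so no sharp rectangle shows up in the final picture.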
1.0.0 (2005-09-23):
- Yahoo "no result found" message changed.
- Corrected a bug in the screensaver which would display the configuration
screen when the screensaver stops.
- Application seems stable enough to go out of beta stage.
Welcome the version 1.0.0 !
1.0.1 (2005-10-30):
- In Windows screensaver, I lowered the sensitivity to the
WM_MOUSEMOVE message.
- Changes in collector_askjeevesimages to adapt to search engine changes.
1.2.0 (2006-02-03):
- Added gnomeWallpaperChanger and kdeWallpaperChanger,
Contributed by Kilian (http://thesermon.free.fr/)
Thanks a lot for the contribution !
gnomeWallpaperChanger requires gconf 2 to work.
kdeWallpaperChanger requires python-dcop, which should be installed by
default with KDE.
- Added a new collector: Flickr
- Changes in collector_yahooimagesearch to adapt search engine changes.
- Changed the name of debug*.html files when a search engine change is
detected.
1.2.1 (2006-02-04):
- Added in the GUI:
- A "Save image" button (same as the "Save" menu option)
- An "Auto-save" checkbox which saves the image after each update
as yyyymmdd_hhmmss.bmp (eg."20060204_2142.bmp").
- An "Update image" button (same as the "Update" menu option)
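A minimal sketch of how the auto-save file name could be built, in Python 3.
autosave_filename() is a hypothetical helper, and the sketch follows the
pattern with seconds (the example above shows hours and minutes only).

```python
import time


def autosave_filename(timestamp=None):
    """Return a name such as 20060204_214200.bmp for the given epoch time."""
    # time.localtime(None) means "now", which is what auto-save would use.
    moment = time.localtime(timestamp)
    return time.strftime("%Y%m%d_%H%M%S", moment) + ".bmp"
```

Names built this way sort chronologically in a directory listing.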
1.2.2 (2006-02-05):
- In gnomeWallpaperChanger, correction for the libgconf-2 library path.
1.2.3 (2006-02-26):
- At last, the X-Windows screensaver !
Contributed by Kilian (thanks for the work !)
- collector_local now avoids /mnt, /proc and /dev directories.
- improved the ImportError messages a *lot*.
This helps to spot import problems (such as nested imports,
eg. ctypes imported in wgx11screensaver imported in webgobbler.)
1.2.4 (2006-02-28):
- New --scale option to allow the images to be scaled up or down before
being superposed. This can be used to create images with more or less
details. For example, use --scale 0.5 to create more detailed images.
- This version was submitted to several download websites (clubic.com,
snapfiles.com, uptodown.com, softpedia.com, download.com, etc.)
1.2.5 (2006-05-02):
- Added keyword search. This feature was requested by several users.
You can use keyword search from the command-line with
the option --keywords (eg. --keywords cats or --keywords "cats dogs").
This option is also available in the GUI.
- In GUI mode, the main window now immediately closes even if a download
is in progress. (This was confusing some users in 1.2.4).
(I'm still trying to find a way to kill a thread in Python... :-/ )
- corrected a bug which would needlessly waste an image from the imagepool
when the image was larger than screen.
- corrected a bug in the get_unix_lib() library search function.
- corrected a bug in the X11 screensaver.
- changes to accommodate ask.com search engine changes.
- other minor corrections.
- msvcr71.dll is now bundled with the Windows installer.
(This is the Microsoft VisualStudio runtime the Python virtual machine
depends on, and some people do not seem to have this DLL on their system.)
- version tested with Python 2.4.3, ctypes 0.9.9.6, PIL 1.1.5, psyco 1.5.1
and PMW 1.2.
1.2.6 (2006-08-08):
- collector_deviantart changed to adapt to the new version of the website.
1.2.8 (2013-04-08):
- checked against Python 2.6
- upgraded Pmw to 1.3.3
- psyco removed (not maintained anymore).
- sha module replaced with hashlib.
- askjeeves and randomimages.us crawlers removed.
- small refactoring
- flickr and deviantArt crawlers updated.
- flickr keyword search now uses Google Images search engine
(because flickr search sucks big time.)
'''
# ==============================================================================
import sys
# FIXME: Assign an icon to the EXE. (Use http://www.angusj.com/resourcehacker/ ?)
# FIXME: Change the default Tk icon, too.
# When this program is frozen into an EXE with cx_Freeze with the no-console version (Win32GUI.exe)
# stdout, stderr and stdin do not exist.
# Any attempt to write to them (with print for example) would trigger an exception and
# the program.exe would display an exception popup.
# We trap this and create dummy stdin/stdout/stderr so that all print and log statements