MARLIERE

Convert paper documents to PDF
Goal: Store scanned paper documents in an electronic PDF format, with a reduced file size (a few KB) and a printable resolution (300 dpi). The method is applicable under Linux, Mac OS X and Windows.

Examples: Hand-written or typed and printed notes, postcards, administrative documents, etc.

1/ Building a PDF - Scanned documents processing

- Scan at 300 dpi resolution, in a raw format such as TIFF or BMP, without any compression
- Open the file with the free software GIMP

If it is a photo (gradients and complex colors):
- Save in JPG format compressed at 15%
- If the file is too big (KB), reduce the image size (pixels)
- Then adjust the resolution (dpi) in order to keep the same metric size (mm)
- Save again in JPG format compressed at 15%
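
The resize step can be checked with a little arithmetic: when the pixel dimensions shrink by some factor, the resolution must shrink by the same factor so that the printed size in mm stays unchanged. A minimal sketch in Python; the function names are illustrative and are not part of GIMP.

```python
MM_PER_INCH = 25.4


def printed_size_mm(pixels, dpi):
    """Printed length in mm of `pixels` pixels at `dpi` dots per inch."""
    return pixels / dpi * MM_PER_INCH


def rescale(width_px, height_px, dpi, factor):
    """Shrink the pixel size by `factor` and scale dpi to keep the metric size."""
    new_width = round(width_px * factor)
    new_height = round(height_px * factor)
    new_dpi = dpi * factor
    return new_width, new_height, new_dpi


# Halving a 300 dpi A4 scan (2480 x 3508 pixels).
w, h, dpi = rescale(2480, 3508, 300, 0.5)
print(w, h, dpi)
# The printed width is the same before and after the resize.
print(printed_size_mm(2480, 300), printed_size_mm(w, dpi))
```

In GIMP the same two values are entered by hand: the new pixel size when scaling the image, and the new resolution in the print size settings.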

If it is a document (simple colors and lines):
- Filter noise and colormap with "Layer/Colors/Levels"
- Filter single pixels with "Filters/Enhance/NL Filter"
- Filter noise and colormap again with "Layer/Colors/Levels"
- Convert to indexed colors (2 to 32) or to black and white with "Image/Mode"
- Save in PNG format at compression level 9 (lossless)

2/ Building a PDF - Exporting the image into PDF format

- Linux command: sam2p -m:dpi:17.27 (72*72/17.27 is about 300 dpi), thus an A4 page of about 209.9 x 297 mm
- or Adobe Acrobat or other shareware (Windows)
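
The figure in parentheses can be verified numerically. The sketch below assumes, as the formula on the command line above suggests, that the effective resolution is 72*72 divided by the value passed to -m:dpi. That relation is taken from this page and has not been checked against the sam2p documentation; the function names are illustrative.

```python
MM_PER_INCH = 25.4


def effective_dpi(m_dpi):
    # Relation used on this page (an assumption here):
    # effective dpi = 72 * 72 / value given to -m:dpi
    return 72 * 72 / m_dpi


def page_size_mm(width_px, height_px, dpi):
    """Printed page size in mm for an image of the given pixel size."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)


dpi = effective_dpi(17.27)
w_mm, h_mm = page_size_mm(2480, 3508, dpi)
print(f"{dpi:.1f} dpi, {w_mm:.1f} x {h_mm:.1f} mm")
```

Under that assumption, 17.27 gives slightly more than 300 dpi, and a value of 17.28 would give 300 dpi.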

Tests of different PNG/JPG-to-PDF export procedures:
- Ghostscript crashes (a bit tricky to use?)
- GIMP/GS PS export: JPG compression and A4 format are unavoidable
- Inkscape produces very big PDF files
- OpenOffice Draw: problem with indexed colors
- a2ping compresses in JPG before calling sam2p
- convert keeps the image size/resolution but produces bigger files
- sam2p -m:dpi:17.27 produces 300 dpi, with the original file size

3/ Building a PDF - Assembling pages

- Linux Command: pdftk file1.pdf file2.pdf cat output file.pdf
- or "pdfjoin" from package "pdfjam", which depends on package "pdflatex" (Linux)
- or Adobe Acrobat or other shareware (Windows)
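
When many pages have to be assembled, the pdftk command line can be generated instead of typed. A minimal sketch, assuming pdftk is installed and on the PATH; the helper name pdftk_cat_command is hypothetical.

```python
import shlex


def pdftk_cat_command(inputs, output):
    """Return the argument list for: pdftk <inputs...> cat output <output>."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return ["pdftk", *inputs, "cat", "output", output]


cmd = pdftk_cat_command(["file1.pdf", "file2.pdf"], "file.pdf")
print(shlex.join(cmd))
# To actually run it (requires pdftk):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than as one string avoids quoting problems with file names that contain spaces.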

4/ Unbuilding a PDF - Disassembling pages

- Linux Command: pdftk file.pdf cat 2 output file2.pdf
- or Adobe Acrobat or other shareware (Windows)
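
To disassemble a whole document, the single-page command above can be repeated once per page. A minimal sketch that only builds the command lines; the page count is supplied by the caller, and the helper names and the output file name pattern are hypothetical.

```python
def extract_page_command(source, page, output):
    """Return the argument list for: pdftk <source> cat <page> output <output>."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return ["pdftk", source, "cat", str(page), "output", output]


def split_commands(source, page_count, pattern="page{:03d}.pdf"):
    """One extraction command per page, numbered from 1 to page_count."""
    return [
        extract_page_command(source, n, pattern.format(n))
        for n in range(1, page_count + 1)
    ]


for command in split_commands("file.pdf", 3):
    print(" ".join(command))
```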

5/ Unbuilding a PDF - Extracting pictures

- "pdfimages" from package "xpdf-utils" (Linux), or GIMP/GS
- or Acrobat Reader or Photoshop or other shareware (Windows)

6/ Information related to A4 paper size

- Metric Size: 210 x 297 mm
- Digital Size: 2480 x 3508 pixels at 300 dpi
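
The pixel size follows from the metric size and the resolution: pixels = mm / 25.4 * dpi, rounded to the nearest integer. A short sketch to compute it for other resolutions; the function names are illustrative.

```python
MM_PER_INCH = 25.4


def pixels_for(mm, dpi):
    """Number of pixels needed to cover `mm` millimetres at `dpi`."""
    return round(mm / MM_PER_INCH * dpi)


def a4_pixels(dpi):
    """Pixel size of an A4 page (210 x 297 mm) at the given resolution."""
    return pixels_for(210, dpi), pixels_for(297, dpi)


for dpi in (72, 150, 300):
    print(dpi, a4_pixels(dpi))
```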
Last Update
09/11/2008