MARLIERE
Sylvain
Convert paper documents to PDF
Goal: Store scanned paper documents in PDF format, with a reduced file size (a few KB) and a printable resolution (300 dpi). The method works under Linux, Mac OS X and Windows.
Examples: Handwritten or typed and printed notes, postcards, administrative documents, etc.
1/ Building a PDF - Scanned documents processing
- Scan at 300 dpi, in a raw format such as TIFF or BMP, without any compression
- Open the file with the free software GIMP
If it is a photo (gradients and complex colors):
- Save in JPG format compressed at 15%
- If the file is still too big (KB), reduce the image size (pixels)
- Then adjust the resolution (dpi) accordingly, in order to keep the same physical size (mm)
- Save again in JPG format compressed at 15%
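The resolution adjustment in the photo steps is simple proportionality: if the pixel count is reduced by some factor, the dpi must be reduced by the same factor so that the printed size in mm does not change. A minimal sketch of that arithmetic follows; the function names are illustrative and are not part of GIMP.

```python
MM_PER_INCH = 25.4

def rescaled_dpi(original_dpi, original_px, new_px):
    """Resolution that keeps the same physical size after resizing in pixels."""
    return original_dpi * new_px / original_px

def size_mm(pixels, dpi):
    """Physical length in mm of a given number of pixels at a given resolution."""
    return pixels / dpi * MM_PER_INCH

# Example: an A4-wide scan (2480 px at 300 dpi) reduced to half its width.
new_dpi = rescaled_dpi(300, 2480, 1240)
print(new_dpi)                 # 150.0
print(size_mm(2480, 300))      # physical width before resizing
print(size_mm(1240, new_dpi))  # same physical width after resizing
```

Halving the pixel size therefore means setting 150 dpi instead of 300 dpi, and the page still prints at the same width.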
If it is a document (simple colors and lines):
- Filter noise and colormap with "Layer/Colors/Levels"
- Filter single pixels with "Filters/Enhance/NL Filter"
- Filter noise and colormap again with "Layer/Colors/Levels"
- Convert to indexed colors (2 to 32 colors) or to black and white with "Image/Mode"
- Save in PNG format compressed at 9 (without loss)
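To see why reducing the number of colors matters for documents, it helps to compare the amount of pixel data before any compression. The sketch below assumes the palette bit depths that the PNG format allows for indexed images (1, 2, 4 or 8 bits per pixel); the function names are illustrative, and the real file will be smaller still once PNG compression is applied.

```python
import math

def png_bit_depth(num_colors):
    """Smallest PNG palette bit depth (1, 2, 4 or 8) that can hold num_colors."""
    for depth in (1, 2, 4, 8):
        if num_colors <= 2 ** depth:
            return depth
    raise ValueError("an indexed PNG palette holds at most 256 colors")

def raw_size_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed pixel data size; each row is padded to a whole byte."""
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px

# A4 page scanned at 300 dpi.
width, height = 2480, 3508
print(raw_size_bytes(width, height, 24))                 # full RGB color
print(raw_size_bytes(width, height, png_bit_depth(32)))  # 32-color palette
print(raw_size_bytes(width, height, png_bit_depth(2)))   # black and white
```

A black-and-white page carries 24 times less raw data than the same page in full color, which is the starting point for getting down to small files.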
2/ Building a PDF - Exporting the image into PDF format
- Linux command: sam2p -m:dpi:17.27 (the resulting resolution is 72*72/17.27, about 300 dpi, so an A4 scan gives a page of 209.9 x 297 mm)
- or Adobe Acrobat Reader or other shareware tools (Windows)
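The value 17.27 can be checked with a few lines of arithmetic. This sketch only reproduces the relation stated in the note (resulting dpi = 72*72 / value) and is not taken from the sam2p documentation; the function names are illustrative.

```python
POINTS_PER_INCH = 72
MM_PER_INCH = 25.4

def effective_dpi(sam2p_value):
    """Resulting resolution, using the relation from the note: 72*72 / value."""
    return POINTS_PER_INCH * POINTS_PER_INCH / sam2p_value

def sam2p_value_for(target_dpi):
    """Inverse relation: value to pass to -m:dpi: for a wanted resolution."""
    return POINTS_PER_INCH * POINTS_PER_INCH / target_dpi

def page_size_mm(width_px, height_px, dpi):
    """Printed page size in mm for an image of the given pixel size."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)

dpi = effective_dpi(17.27)
print(dpi)                            # slightly above 300
print(sam2p_value_for(300))           # value for exactly 300 dpi
print(page_size_mm(2480, 3508, dpi))  # page size for an A4 scan
```

Under that relation, 17.27 gives a resolution marginally above 300 dpi, which is why the page comes out a fraction of a millimetre under 210 x 297 mm.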
Test of different PNG/JPG-to-PDF export procedures
- Ghostscript: crashes (a bit tricky to use?)
- Gimp/GS PostScript export: JPG compression and A4 format cannot be avoided
- Inkscape: produces very big PDF files
- OpenOffice Draw: problems with indexed colors
- a2ping: compresses in JPG before calling sam2p
- convert (ImageMagick): keeps image size and resolution but produces bigger files
- sam2p -m:dpi:17.27: produces 300 dpi and keeps the original file size
3/ Building a PDF - Assembling pages
- Linux Command: pdftk file1.pdf file2.pdf cat output file.pdf
- or "pdfjoin" from package "pdfjam", which depends on package "pdflatex" (Linux)
- or Adobe Acrobat Reader or other shareware tools (Windows)
4/ Unbuilding a PDF - Disassembling pages
- Linux Command: pdftk file.pdf cat 2 output file2.pdf
- or Adobe Acrobat Reader or other shareware tools (Windows)
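The same pdftk "cat" operation accepts several page numbers, and also ranges such as 1-3, so more than one page can be pulled out at once. A small sketch of a command builder is shown here; the helper name is hypothetical, and the resulting list can be passed to subprocess.run.

```python
def pdftk_extract_command(input_pdf, pages, output_pdf):
    """Build: pdftk file.pdf cat <pages...> output file2.pdf

    pages is a sequence of page numbers or range strings such as "1-3".
    """
    page_args = [str(page) for page in pages]
    if not page_args:
        raise ValueError("at least one page or page range is required")
    return ["pdftk", input_pdf, "cat", *page_args, "output", output_pdf]

# Page 2 only, as in the command above.
print(pdftk_extract_command("file.pdf", [2], "file2.pdf"))
# Pages 1 to 3 followed by page 5.
print(pdftk_extract_command("file.pdf", ["1-3", 5], "extract.pdf"))
```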
5/ Unbuilding a PDF - Extracting pictures
- "pdfimages" from package "xpdf-utils" (Linux), or Gimp/GS
- or Acrobat Reader, Photoshop or other shareware tools (Windows)
6/ Information related to A4 paper size
- Metric Size: 210 x 297 mm
- Digital Size: 2480 x 3508 pixels in 300dpi
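The pixel dimensions follow directly from the metric size: divide by 25.4 to get inches, then multiply by the resolution. A short sketch of that conversion, with an illustrative function name:

```python
MM_PER_INCH = 25.4

def mm_to_pixels(length_mm, dpi):
    """Number of pixels needed to cover length_mm at the given resolution."""
    return round(length_mm / MM_PER_INCH * dpi)

# A4 at 300 dpi, as listed above.
print(mm_to_pixels(210, 300), mm_to_pixels(297, 300))  # 2480 3508
# A4 at 72 dpi, for comparison.
print(mm_to_pixels(210, 72), mm_to_pixels(297, 72))    # 595 842
```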