All times are UTC - 5 hours [ DST ]



 Post subject: Batch background removal on non-uniform images
PostPosted: Tue Feb 26, 2019 9:58 am  (#1) 
GimpChat Member

Joined: Feb 25, 2019
Posts: 5
GIMP Version: 2.10.8
Operating System: Windows
OS Version: 7 Pro
GIMP Experience: New User
URL or Image link: https://drive.google.com/drive/folders/1jufdwxA69xzJ5

List any relevant plug-ins or scripts:
DivideScannedImages, G'MIC and Batch Image Manipulation



The issue I'm having is the following: I have an archive of about 700,000 tif images that were originally microfilms.

They are all black and white and are basically paper with information on it. I'm trying to figure out a way to trim/crop out the black background, so as to reduce the image size/resolution to show only the paper part as the whole image, but preserving everything inside the paper part.

Thing is, the images are REALLY messy and noisy. The black parts are really noisy, full of white dots and lines, and the paper section of the image is no better, only in reverse (full of black dots on white paper), and the actual information on the paper can be really blurry, sometimes looking like a bunch of black blotches.

So far I've had more success with ImageMagick, thanks to the help of an amazing guy on their forum, but the images vary so much in quality and resolution that it seems impossible to write a single command line that properly removes the black background and trims every image the way I need.

I've also had some success using the DivideScannedImages script, but for some reason it's ignoring most of the images in the folder I point it to.

Do you guys have any suggestions?


 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Tue Feb 26, 2019 2:12 pm  (#2) 
GimpChat Member

Joined: Sep 20, 2016
Posts: 293
Perhaps a better option would be to convert the documents to a different file type?

According to my research, TIFF files tend to be quite big.

Perhaps GIF or JPG would be a better option?

source:
http://users.wfu.edu/matthews/misc/grap ... rmats.html


 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Tue Feb 26, 2019 2:59 pm  (#3) 
Script Coder

Joined: Oct 25, 2010
Posts: 4812
Possible algorithm:

- make a copy of the image
- blur it heavily, this should normally give a light center and very dark edges
- threshold this (threshold to be determined experimentally)
- use the result as a mask to crop the initial image
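
A minimal sketch of these four steps in plain Python, outside GIMP. It assumes the image is already loaded as a list of rows of 0-255 grey values. The function names, the simple box blur, and the default radius and threshold are assumptions made for illustration; both values would have to be tuned on real samples.

```python
def box_blur(img, radius):
    """Heavy blur: each pixel becomes the mean of its square neighbourhood.
    The window is truncated at the image borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sum(img[yy][xx] for yy in range(y0, y1) for xx in range(x0, x1))
            out[y][x] = total // ((y1 - y0) * (x1 - x0))
    return out


def paper_bbox(img, radius=2, threshold=127):
    """Blur a copy, threshold it, and return the bounding box of the light
    area as (left, top, right, bottom), or None if nothing is light enough."""
    blurred = box_blur(img, radius)  # box_blur builds a new image: this is the "copy"
    rows = [y for y, row in enumerate(blurred) if any(v > threshold for v in row)]
    cols = [x for x in range(len(blurred[0])) if any(row[x] > threshold for row in blurred)]
    if not rows or not cols:
        return None
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def crop_to_paper(img, radius=2, threshold=127):
    """Crop the ORIGINAL pixels to the box found on the blurred copy."""
    box = paper_bbox(img, radius, threshold)
    if box is None:
        return img  # nothing detected: leave the image untouched
    left, top, right, bottom = box
    return [row[left:right] for row in img[top:bottom]]
```

Because only the copy is blurred, isolated white specks in the black border get averaged away before the threshold, while everything inside the detected box is kept exactly as it was in the original.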

Of course, across 700K images some will need a lighter or stronger threshold. But you can run a first batch with a given threshold value, check the results visually, and rerun the rejected ones with a lighter or stronger threshold.
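
That batch-then-rerun workflow could be organised roughly like this. It is a sketch only: `crop_fn`, `looks_ok` and the threshold values are hypothetical placeholders. `crop_fn(name, threshold)` stands for whatever crops one file, and `looks_ok(name, result)` stands in for the visual check (or any automatic sanity test you trust).

```python
def run_with_retries(names, crop_fn, looks_ok, thresholds=(127, 100, 160)):
    """Try each threshold in turn, rerunning only the files rejected so far."""
    accepted = {}            # name -> (threshold used, result)
    pending = list(names)    # files still waiting for an acceptable crop
    for threshold in thresholds:
        rejected = []
        for name in pending:
            result = crop_fn(name, threshold)
            if looks_ok(name, result):
                accepted[name] = (threshold, result)
            else:
                rejected.append(name)
        pending = rejected
        if not pending:
            break
    # Whatever is left in pending failed at every threshold and needs manual work.
    return accepted, pending
```

Each pass only touches the files that failed the previous one, so the expensive visual review shrinks with every rerun.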

Giving a URL where we could find a few sample images would help.

 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Tue Feb 26, 2019 3:05 pm  (#4) 
GimpChat Member

Joined: Mar 04, 2011
Posts: 2599
ofnuts wrote:
Giving a URL where we could find a few sample images would help.


The IM forum thread the OP referenced is this:

http://www.imagemagick.org/discourse-se ... 7a#p163385

Some images there. Best examples right at the end.


 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Tue Feb 26, 2019 3:31 pm  (#5) 
GimpChat Member

Joined: Feb 25, 2019
Posts: 5
Konstantin wrote:
Perhaps a better option would be to convert the documents to a different file type?

According to my research, TIFF files tend to be quite big.

Perhaps GIF or JPG would be a better option?


The TIFs are compressed, most of them at barely 100 KB. Does the file type matter for the DivideScannedImages script? If it does, I can try converting and get back here.

ofnuts wrote:
Giving a URL where we could find a few sample images would help.

I did, it's in the Drive link at the beginning. There are some 40 images there for testing.

I'm also not allowed to post links since I'm a new user :(


 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Wed Feb 27, 2019 8:02 am  (#6) 
GimpChat Member

Joined: Feb 25, 2019
Posts: 5
@Konstantin, I've done all the possible conversions of the images, and the script (DivideScannedImages) still ignores most of them...



ofnuts wrote:
Possible algorithm:

- make a copy of the image
- blur it heavily, this should normally give a light center and very dark edges
- threshold this (threshold to be determined experimentally)
- use the result as a mask to crop the initial image

@ofnuts, do you mean using the Batch Image Manipulation plugin? Could you name the procedures? I'm really new to GIMP and to image manipulation in general.


 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Wed Feb 27, 2019 10:16 am  (#7) 
Script Coder

Joined: Oct 25, 2010
Posts: 4812
No, I mean using a script. But the algorithm can be tried manually first (just to see if it's worth writing the script).

 Post subject: Re: Batch background removal on non-uniform images
PostPosted: Wed Feb 27, 2019 1:39 pm  (#8) 
GimpChat Member

Joined: Feb 25, 2019
Posts: 5
@ofnuts, so here's something that worked...

- Duplicate Layer
- Median Blur
  + Neighborhood: Diamond
  + Radius: 70
  + Abyss Policy: Clamp
  + High Precision
- Threshold (this didn't really alter the image in any way)
  + 127 to 255
- Crop to Content
- Delete Layer (the copy layer)

This worked perfectly. Is there a way to script this?
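
As a rough illustration of what a script for these steps would have to do, here is a sketch in plain Python rather than GIMP's own scripting API. Several things in it are assumptions and not what GIMP does internally: the image is a list of rows of 0-255 grey values, the neighbourhood is square instead of Diamond, border windows are truncated instead of clamped, the radius is tiny (the steps above used 70), and all function names are made up for the example.

```python
from statistics import median_low


def median_blur(img, radius):
    """Replace each pixel by the median of its square neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        rows = img[max(0, y - radius):y + radius + 1]
        out_row = []
        for x in range(w):
            window = [v for r in rows for v in r[max(0, x - radius):x + radius + 1]]
            out_row.append(median_low(window))
        out.append(out_row)
    return out


def threshold(img, low=127, high=255):
    """Pixels inside [low, high] become white (255), everything else black (0)."""
    return [[255 if low <= v <= high else 0 for v in row] for row in img]


def crop_to_content(img, mask):
    """Crop img to the bounding box of the white pixels in mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows or not cols:
        return img  # empty mask: leave the image untouched
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


def trim_scan(img, radius=1, low=127):
    copy = [row[:] for row in img]                    # Duplicate Layer
    mask = threshold(median_blur(copy, radius), low)  # Median Blur, then Threshold
    return crop_to_content(img, mask)                 # Crop; the copy is simply dropped
```

On a strictly black-and-white image the median of any window is already 0 or 255, which would probably explain why the Threshold step changed nothing. A pure-Python loop like this is far too slow for radius 70 on 700,000 files; it only shows the order of operations.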

