- All Implemented Interfaces:
- Transform
public class NormalizeTransform
extends Object
implements Transform
Raw images such as JPEGs must be normalized before use.
- This requires a first pass over the dataset to find the min / max pixel values.
Because this is an image-specific normalizer, the min and max are computed across all "columns" (pixels) of every image in the collection
- as opposed to computing them per column across the images in the collection
- because pixel intensities are related across the image's grid of pixels
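The two-pass idea described above can be sketched as follows. This is a minimal illustration, not the actual NormalizeTransform source: the class name GlobalMinMaxSketch, the method names, the [0, 1] target range, and the handling of a zero range are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; names and target range are assumptions, not the library API. */
final class GlobalMinMaxSketch {

    /** Pass 1: scan every pixel of every image and return {min, max}. */
    static double[] scanMinMax(List<double[]> images) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] pixels : images) {
            for (double p : pixels) {
                if (p < min) min = p;
                if (p > max) max = p;
            }
        }
        return new double[] {min, max};
    }

    /** Pass 2: rescale one image into [0, 1] using the collection-wide range. */
    static double[] normalize(double[] pixels, double min, double max) {
        double range = max - min;
        double[] out = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            // Assumption: a constant-valued collection maps to 0.0 to avoid dividing by zero.
            out[i] = range == 0.0 ? 0.0 : (pixels[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        List<double[]> images = Arrays.asList(
            new double[] {0, 128, 255},
            new double[] {64, 64, 192});
        double[] minMax = scanMinMax(images);
        for (double[] image : images) {
            System.out.println(Arrays.toString(normalize(image, minMax[0], minMax[1])));
        }
    }
}
```

Because a single min and max are shared by the whole collection, the relative brightness between images is preserved, which a per-column scaling would not guarantee.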
Questions:
- Can this be done in a way that will parallelize well later?
- Probably not; it most likely requires a v2 refactor for MR
Label Semantics
- NOTE: do not normalize the LABEL!
1. Image: ImageInputFormat > { [array of doubles], directoryLabelID } // image data, followed by the source directory's index as an int ID
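The label rule above can be illustrated with a small sketch in which only the pixel array is rescaled and the directory label ID is carried through unchanged. The class LabeledImageSketch and its fields are hypothetical stand-ins for the record layout shown above, not types from the library, and the min / max arguments are assumed to come from a prior scan of the dataset.

```java
/** Hypothetical record layout: pixel data plus the directory label ID. */
final class LabeledImageSketch {
    final double[] pixels;       // image data
    final int directoryLabelId;  // index of the source directory, used as the label

    LabeledImageSketch(double[] pixels, int directoryLabelId) {
        this.pixels = pixels;
        this.directoryLabelId = directoryLabelId;
    }

    /** Returns a copy with pixels rescaled into [0, 1]; the label is passed through as-is. */
    LabeledImageSketch normalized(double min, double max) {
        double range = max - min;
        double[] scaled = new double[pixels.length];
        for (int i = 0; i < pixels.length; i++) {
            // Assumption: a zero range maps every pixel to 0.0.
            scaled[i] = range == 0.0 ? 0.0 : (pixels[i] - min) / range;
        }
        // The label is an ID, not an intensity, so it is never rescaled.
        return new LabeledImageSketch(scaled, directoryLabelId);
    }
}
```

Keeping the label in a separate field, rather than as the last element of the double array, is one way to make it hard to normalize by accident.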
- Author:
- josh