Currently the stock input format / RR hands us a vector that has already been converted
- TODO: separate this into a transform plugin
Thoughts
- Inside the vectorization engine is a great place to put a pluggable transformation system [ TODO: v2 ]
- example: MNIST binarization could be a pluggable transform
- example: custom thresholding on blocks of pixels
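
A rough sketch of what a pluggable transform step could look like, assuming transforms are plain functions from a flat vector to a flat vector. None of these names exist in the engine yet: `binarize`, `block_threshold`, and `TransformPipeline` are hypothetical, and the "blocks of pixels" here are contiguous runs of a flattened image, which is only one possible reading of the second example.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical plugin contract: a transform maps one vector to another.
Transform = Callable[[Sequence[float]], Vector]


def binarize(threshold: float = 0.5) -> Transform:
    """MNIST-style binarization: values above the threshold become 1.0, else 0.0."""
    def apply(values: Sequence[float]) -> Vector:
        return [1.0 if v > threshold else 0.0 for v in values]
    return apply


def block_threshold(block_size: int, threshold: float) -> Transform:
    """Threshold contiguous blocks of a flat pixel vector on the block mean.

    Assumption: a "block" is block_size consecutive entries of the vector.
    Every entry in a block is set to 1.0 if the block mean exceeds the
    threshold, otherwise 0.0. A short trailing block is handled the same way.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")

    def apply(values: Sequence[float]) -> Vector:
        out: Vector = []
        for start in range(0, len(values), block_size):
            block = values[start:start + block_size]
            on = 1.0 if sum(block) / len(block) > threshold else 0.0
            out.extend([on] * len(block))
        return out
    return apply


class TransformPipeline:
    """Applies registered transforms in order to an already-converted vector."""

    def __init__(self) -> None:
        self._transforms: List[Transform] = []

    def register(self, transform: Transform) -> "TransformPipeline":
        self._transforms.append(transform)
        return self  # allow chaining

    def apply(self, values: Sequence[float]) -> Vector:
        out: Vector = list(values)
        for transform in self._transforms:
            out = transform(out)
        return out
```

Usage would be along the lines of `TransformPipeline().register(binarize(0.5)).apply(pixels)`, run on the vector after the input format / RR has produced it. Keeping transforms as plain callables means a new one can be added without touching the engine itself.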
Text Pipeline specific stuff
- Right now the TF-IDF code has 2 major issues:
1.