Semi-Supervised Machine Learning
Problems where you have a large amount of input data (X) and only some of the
data is labeled (Y) are called semi-supervised learning problems.
These problems sit between supervised and unsupervised learning.
A good example is a photo archive where only some of the images are labeled
(e.g. dog, cat, person) and the majority are unlabeled.
Many real-world machine learning problems fall into this area. This is because
labeling data can be expensive or time-consuming and may require access to
domain experts, whereas unlabeled data is cheap and easy to collect and store.
You can use unsupervised learning techniques to discover and learn the
structure in the input variables.
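As a rough illustration of discovering structure without labels, the sketch
below runs a very small k-means clustering on made-up 2D points. The function
name k_means, the toy data and the choice of starting centroids are all
assumptions made for this example. k-means is only one of many unsupervised
techniques you could use here.

```python
from math import dist

def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        # Assignment step: group points by the index of the closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(axis) for axis in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical unlabeled inputs (X) forming two loose groups.
unlabeled = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Assumed starting centroids: one point picked from each apparent group.
centroids, clusters = k_means(unlabeled, centroids=[unlabeled[0], unlabeled[3]])
```

No labels are used at any point. The two groups emerge from the distances
between the inputs alone.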
You can also use supervised learning techniques to make best-guess predictions
for the unlabeled data, feed those predictions back into the supervised learning
algorithm as training data, and use the resulting model to make predictions on
new, unseen data.
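The loop described above can be sketched in a few lines. This is a minimal
example under stated assumptions, not a definitive implementation: the
nearest-centroid classifier, the helper names fit_centroids and predict, and
the toy data are all chosen for illustration. Any supervised learning algorithm
could take the classifier's place.

```python
from math import dist

def fit_centroids(X, y):
    """Train a nearest-centroid classifier: one mean point per class label."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {
        label: tuple(sum(axis) / len(axis) for axis in zip(*pts))
        for label, pts in groups.items()
    }

def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: dist(x, centroids[label]))

# Hypothetical data: two labeled examples and four unlabeled ones.
X_labeled = [(1.0, 1.0), (5.0, 5.0)]
y_labeled = ["cat", "dog"]
X_unlabeled = [(1.2, 0.9), (0.8, 1.3), (4.7, 5.2), (5.3, 4.8)]

# 1. Train on the small labeled set.
model = fit_centroids(X_labeled, y_labeled)

# 2. Make best-guess predictions for the unlabeled data.
pseudo_labels = [predict(model, x) for x in X_unlabeled]

# 3. Feed those predictions back in as additional training data.
model = fit_centroids(X_labeled + X_unlabeled, y_labeled + pseudo_labels)

# 4. Use the retrained model on a new, unseen input.
prediction = predict(model, (4.5, 4.5))
```

This sketch accepts every best guess as if it were a true label. In practice
it is common to keep only the guesses the model is most confident about, since
wrong guesses fed back as training data can reinforce the model's mistakes.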