Most machine learning approaches have so far focused on coupled data settings, where observables and responses (X_i, Y_i)_{i≤n} are given jointly. However, in many practical regression problems, such couplings are not available. Instead, the data components (X_i)_{i≤n} and (Y_j)_{j≤n} are provided separately, which requires a different regression formulation. For example, predicting treatment effect on tumors in personalized medicine requires measuring single-cell RNA sequences, which destroys the measured cells and therefore yields uncoupled data. Another example is the collection of data components related to personal information by different organizations or companies, which are not allowed to share named or identifiable data due to privacy concerns. Consequently, recent works have addressed this problem in a distribution-to-distribution sense, showing promising performance. Nonetheless, they have been restricted to low-dimensional settings, and the goal of this project is to develop a scalable approach to distribution-to-distribution regression problems.