Breaking the Computation and Communication Abstraction Barrier in Distributed Machine Learning Workloads