English-Spanish Healthcare.gov Corpus v1

This is the English-Spanish corpus of the healthcare.gov/cuidadodesalud.gov website as of April 2019, in TMX format. It contains 113,251 English source words in deduplicated, randomly ordered segments. The corpus can be used to train machine translation systems. Find out how at https://polyglot.technology
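As a rough illustration of how the segments might be read before training, here is a minimal Python sketch that pulls English-Spanish pairs out of a TMX file using only the standard library. It assumes the file follows the usual TMX layout (tu elements containing tuv elements with an xml:lang attribute and a seg child) and that the language codes begin with "en" and "es"; the function name read_tmx_pairs and the file name in the usage note are hypothetical and should be adjusted to the actual download.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_prefix="en", tgt_prefix="es"):
    """Return (source_text, target_text) pairs from a TMX file.

    `source` may be a path or a binary file-like object.
    Language prefixes are assumptions; check the codes used in the file.
    """
    pairs = []
    root = ET.parse(source).getroot()
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                texts[lang] = "".join(seg.itertext()).strip()
        src = next((t for l, t in texts.items() if l.startswith(src_prefix)), None)
        tgt = next((t for l, t in texts.items() if l.startswith(tgt_prefix)), None)
        if src and tgt:
            pairs.append((src, tgt))
    return pairs
```

Calling read_tmx_pairs("healthcare-gov-en-es.tmx") (a placeholder file name) would return a list of tuples that can be written out as two line-aligned text files, which is the input form most machine translation toolkits expect.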


This translation memory of healthcare.gov translations © Polyglot Technology LLC is made available under the Open Data Commons Attribution License: http://opendatacommons.org/licenses/by/1.0. Individual contents of the database are in the public domain.

Known Issues

Some bulleted lists from the originating pages are merged together into single segments.

  • Size: 519 KB
  • English Source Words (Deduplicated): 113,251


