Input Data in RecordIO Protobuf format with token ids Hyperparameters Batch size Optimizer Learning rate Number of layers Metric Accuracy Perplexity BLEU score