Question about the training process #17

Open
ironsuperdev opened this issue Jun 10, 2021 · 0 comments
def train(
        train_loader: Any,
        epoch: int,
        criterion: Any,
        logger: Logger,
        encoder: Any,
        decoder: Any,
        encoder_optimizer: Any,
        decoder_optimizer: Any,
        model_utils: ModelUtils,
        rollout_len: int = 30,
) -> None:
    for i, (_input, target, helpers) in enumerate(train_loader):
        _input = _input.to(device)   # !!! here one whole batch of data is loaded !!!
       target = target.to(device)

       # Set to train mode
       encoder.train()
       decoder.train()

       # Zero the gradients
       encoder_optimizer.zero_grad()
       decoder_optimizer.zero_grad()

       # Encoder
       batch_size = _input.shape[0]
       input_length = _input.shape[1]
       # output_length = target.shape[1]
       # input_shape = _input.shape[2]

       # Initialize encoder hidden state
       encoder_hidden = model_utils.init_hidden(
           batch_size,
           encoder.module.hidden_size if use_cuda else encoder.hidden_size)

       # Initialize losses
       loss = 0

       # Encode observed trajectory
        for ei in range(input_length):       # !!! in this loop, the complete batch of 2 s of observed data is fed through the encoder !!!
            encoder_input = _input[:, ei, :]    # !!! each iteration the data of a certain time stamp (ei * 0.1 s) is chosen !!!
           encoder_hidden = encoder(encoder_input, encoder_hidden)   

       # Initialize decoder input with last coordinate in encoder
        decoder_input = encoder_input[:, :2]    # !!! which data from the batch is used here? I don't fully understand this !!!

       # Initialize decoder hidden state as encoder hidden state
       decoder_hidden = encoder_hidden

       decoder_outputs = torch.zeros(target.shape).to(device)

       # Decode hidden state in future trajectory
       for di in range(rollout_len):
           decoder_output, decoder_hidden = decoder(decoder_input, decoder_hidden)
           decoder_outputs[:, di, :] = decoder_output

           # Update loss
           loss += criterion(decoder_output[:, :2], target[:, di, :2])

           # Use own predictions as inputs at next step
           decoder_input = decoder_output

I don't fully understand the code above, especially at the places where I added comments.

  1. During training with a batch of data, the encoder is fed all trajectories of the batch at the same recorded time step. Is this the correct procedure, and does it affect the LSTM's internal states?

  2. After that, the decoder takes encoder_input[:, :2] as its initial input. What exactly is this data? Is it the last recorded trajectory in the batch, or the data at the same (last) time step from all trajectories in the whole batch? (See the slicing sketch below.)
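
To make my confusion concrete, here is a minimal slicing sketch of what I think is happening. The shapes are made up for illustration (4 trajectories, 20 observed steps, 2 features); the real feature dimension comes from the data loader and may differ.

import torch

# Toy batch: 4 trajectories, 20 observed time steps (2 s at 10 Hz), 2 features (x, y).
_input = torch.randn(4, 20, 2)

# Inside the encoder loop: one time step ei, but for ALL trajectories in the batch.
ei = 19
encoder_input = _input[:, ei, :]      # shape (4, 2): step ei of every trajectory

# After the loop, encoder_input still holds the LAST observed time step, so this
# slice is the last observed (x, y) of every trajectory, not a single trajectory.
decoder_input = encoder_input[:, :2]  # shape (4, 2)
print(encoder_input.shape, decoder_input.shape)

If that reading is right, the decoder is seeded with the last observed coordinates of all trajectories at once, but I would like to confirm.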

Thanks in advance for any explanation of this.

BR, Song
