Assignment 4
CS 4341, summer 2024
100 points total
Due: XXXXXXXXXX:59
Delivery: Submit via Canvas
 
For this assignment, you will:
(30 pts) Go through the process of building the Seq2Seq model for text summarization:
Download the Dataset
Fill in your path to the Dataset
Run the code from start to finish
(20 pts) Question 1: Explain the tokenized sequences.
(10 pts) Question 2: Explain the necessity of multiple inputs and specific dimensions in the model.
(10 pts) Question 3: Analyze the parameter counts in the model's summary.
(10 pts) Question 4: Understand the purpose and difference between decoder sequences.
(20 pts) Question 5: Discuss and suggest improvements for the generated titles.
 
Tutorial
To create a model that generates news titles based on the content of articles, you can follow a
tutorial that implements text summarization using a Sequence-to-Sequence (Seq2Seq) model,
which is a powerful architecture for tasks like these. Here's a step-by-step guide to help you
through this process, based on the tutorial provided:
Step 1: Download and Explore the Dataset
Student Task: Download the Dataset:
Go to the NYT News Dataset XXXXXXXXXX on Kaggle (https://www.kaggle.com/datasets/brendanmiles/nyt-news-dataset XXXXXXXXXX).
Download the dataset and extract it as NYT_Dataset.csv to your working directory.
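If you have the Kaggle CLI configured, the dataset can also be fetched from a notebook cell (a sketch only; the brendanmiles/nyt-news-dataset slug is assumed from the URL above, so verify it on the Kaggle page, and rename the extracted CSV to NYT_Dataset.csv if needed):
!kaggle datasets download -d brendanmiles/nyt-news-dataset
!unzip nyt-news-dataset.zip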
Step 2: Data Preprocessing
1. Install Required Libraries:
!pip install pandas numpy tensorflow keras
2. Import Libraries:
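Note: depending on your installed TensorFlow/Keras versions, the keras.* imports used below may need to come from tensorflow.keras instead (an assumption about your environment, not part of the original tutorial), for example:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences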
3. Load and Preprocess Data:
Student Task: Fill in your path to the Dataset:
4. Tokenize the text:
Question 1 (20 pts):
(1) What are the shapes of content_seq and title_seq?
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense, TimeDistributed, RepeatVector, Bidirectional, Dropout, Input
from keras.models import Model
from keras.layers import Concatenate
# Load the dataset
data = pd.read_csv('Your path here to load the dataset')
# Select relevant columns
data = data[['abstract', 'headline']]
# Drop rows with missing values
data.dropna(inplace=True)
# Rename columns for ease of use
data.columns = ['content', 'title']
# Add start/end tokens so the decoder knows where a title begins and ends
data['title'] = data['title'].apply(lambda x: 'starttoken ' + x + ' endtoken')
# Tokenize and pad the sequences
max_len_content = 100  # max length for content
max_len_title = 20  # max length for title
# Vocabulary sizes
vocab_size_content = 30000
vocab_size_title = 10000
# Tokenizer for content
tokenizer_content = Tokenizer(num_words=vocab_size_content)
tokenizer_content.fit_on_texts(data['content'])
content_seq = tokenizer_content.texts_to_sequences(data['content'])
content_seq = pad_sequences(content_seq, maxlen=max_len_content, padding='post')
# Tokenizer for title
tokenizer_title = Tokenizer(num_words=vocab_size_title)
tokenizer_title.fit_on_texts(data['title'])
title_seq = tokenizer_title.texts_to_sequences(data['title'])
title_seq = pad_sequences(title_seq, maxlen=max_len_title, padding='post')
(2) What is the first row of title_seq?
(3) Explain why it is necessary to tokenize the text.
(4) Explain why there are multiple zeros at the end of the title sequence.
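A quick way to inspect these values (a minimal sketch, assuming the preprocessing code above has been run):
print(content_seq.shape)  # number of articles x max_len_content (100)
print(title_seq.shape)    # number of articles x max_len_title (20)
print(title_seq[0])       # first tokenized, padded title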
Step 3: Building the Seq2Seq Model
1. Define the Encoder:
2. Define the Decoder:
Question 2 (10 pts):
(1) Explain why two Inputs (encoder_inputs and decoder_inputs) are needed.
(2) Explain why the dimension of decoder_lstm is latent_dim*2 (instead of latent_dim).
3. Dense Layer for Output:
latent_dim = 256
# Encoder
encoder_inputs = Input(shape=(max_len_content,))
encoder_embedding = Embedding(vocab_size_content, latent_dim, trainable=True)(encoder_inputs)
encoder_lstm1 = Bidirectional(LSTM(latent_dim, return_state=True, return_sequences=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder_lstm1(encoder_embedding)
encoder_lstm2 = Bidirectional(LSTM(latent_dim, return_state=True, return_sequences=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder_lstm2(encoder_outputs)
encoder_lstm3 = Bidirectional(LSTM(latent_dim, return_state=True, return_sequences=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder_lstm3(encoder_outputs)
# Concatenate the forward and backward states
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
# Decoder
decoder_inputs = Input(shape=(max_len_title-1,))
decoder_embedding = Embedding(vocab_size_title, latent_dim, trainable=True)(decoder_inputs)
decoder_lstm = LSTM(latent_dim*2, return_sequences=True, return_state=True, dropout=0.4, recurrent_dropout=0.2)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=[state_h, state_c])
# Dense layer mapping each decoder timestep to a distribution over the title vocabulary
dense = TimeDistributed(Dense(vocab_size_title, activation='softmax'))
output = dense(decoder_outputs)
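One way to inspect the dimensions that Question 2 asks about (a small sketch, assuming the encoder/decoder code above has run):
print(state_h.shape)  # concatenated encoder hidden state, (None, latent_dim*2)
print(state_c.shape)  # concatenated encoder cell state, (None, latent_dim*2)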
4. Define and Compile the Model:
Question 3 (10 pts):
(1) From the output of model.summary(), explain why the two Embeddings have
parameter counts of 7,680,000 and 5,130,000 respectively. (Express them in the form
like a+b, a*b, a^2*b or a*b+c etc.)
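To see the per-layer parameter counts this question refers to, one option (a small sketch; run it after the model = Model(...) and model.compile(...) lines shown later in this handout) is:
for layer in model.layers:
    print(layer.name, layer.count_params())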
Step 4: Training the Model
1. Prepare the Data for Training:
2. Train the Model:
Question 4 (10 pts):
(1) Explain why there are two decoder sequences during training: decoder_input_seq and decoder_output_seq.
(2) What is the difference between their values?
Step 5: Generating Titles
1. Define the Inference Model:
# Define the model
model = Model([encoder_inputs, decoder_inputs], output)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()
# Split the data
X_train, X_val, y_train, y_val = train_test_split(content_seq, title_seq, test_size=0.2, random_state=42)
# Create the decoder input and output sequences for training
decoder_input_seq = y_train[:, :-1]
decoder_output_seq = y_train[:, 1:]
# For validation data
decoder_input_val_seq = y_val[:, :-1]
decoder_output_val_seq = y_val[:, 1:]
history = model.fit(
    [X_train, decoder_input_seq],
    np.expand_dims(decoder_output_seq, -1),
    epochs=10,
    batch_size=128,
    validation_data=([X_val, decoder_input_val_seq], np.expand_dims(decoder_output_val_seq, -1))
)
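# Illustrative check for Question 4 (assumes the split above has run):
# compare the first training title with its decoder input/output versions.
print(y_train[0])             # full padded title sequence
print(decoder_input_seq[0])   # y_train[0] without its last token
print(decoder_output_seq[0])  # y_train[0] without its first token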
# Encoder model for inference
encoder_model = Model(encoder_inputs, [encoder_outputs, state_h, state_c])
2. Function to Generate Titles:
3. Generate Titles for New Content:
# Decoder model for inference
decoder_state_input_h = Input(shape=(latent_dim*2,))
decoder_state_input_c = Input(shape=(latent_dim*2,))
decoder_hidden_state_input = Input(shape=(max_len_content, latent_dim*2))
decoder_output2, state_h, state_c = decoder_lstm(
    decoder_embedding, initial_state=[decoder_state_input_h, decoder_state_input_c]
)
decoder_outputs = dense(decoder_output2)
decoder_model = Model(
    [decoder_inputs] + [decoder_hidden_state_input, decoder_state_input_h, decoder_state_input_c],
    [decoder_outputs] + [state_h, state_c]
)
def decode_sequence(input_seq):
    # Encode the input
    enc_out, enc_h, enc_c = encoder_model.predict(input_seq)
    # Generate an empty target sequence of length 1 holding the start token
    target_seq = np.zeros((1, 1))
    target_seq[0, 0] = tokenizer_title.word_index['starttoken']
    stop_condition = False
    decoded_sentence = ''
    while not stop_condition:
        output_tokens, h, c = decoder_model.predict([target_seq] + [enc_out, enc_h, enc_c])
        # Sample the most likely token
        sampled_token_index = np.argmax(output_tokens[0, -1, :])
        # Index 0 is the padding index and has no word, so fall back to an empty string
        sampled_token = tokenizer_title.index_word.get(sampled_token_index, '')
        decoded_sentence += ' ' + sampled_token
        # Exit condition: hit the stop token, the padding index, or the max title length
        if sampled_token == 'endtoken' or sampled_token_index == 0 or len(decoded_sentence.split()) >= max_len_title:
            stop_condition = True
        # Update the target sequence (of length 1)
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = sampled_token_index
        # Update internal states
        enc_h, enc_c = h, c
    return decoded_sentence
Question 5 (open question) (20 pts):
(1) Discuss the results of the generated titles compared to the original titles.
(2) What are some possible ways to improve the generated results?
for i in range(10):  # generate titles for the first 10 articles
    input_seq = content_seq[i:i+1]
    decoded_sentence = decode_sequence(input_seq)
    print(f"Content: {data['content'].iloc[i]}\n")
    print(f"Original Title: {data['title'].iloc[i]}\n")
    print(f"Generated Title: {decoded_sentence}\n")