TENSORFLOW DEVELOPER CERTIFICATION
Although I worked with TensorFlow in Lambda School, I wanted to earn an official TensorFlow Developer certification, so I’ve been using Coursera to study for the exam.
Loss functions — measure how good or how bad the model’s prediction was
Optimizers — adjust the model to produce a new, improved prediction based on the value from the loss function
Layers — groups of nodes (neurons), each of which defines its output given an input or set of inputs
QUIZ
Question 1 — The diagram for traditional programming had Rules and Data In, but what came out?
Answers
Question 2 — The diagram for Machine Learning had Answers and Data In, but what came out?
Rules
Question 3 — When I tell a computer what the data represents (i.e. this data is for walking, this data is for running), what is that process called?
Labelling
Question 4 — What is a Dense?
A layer of connected neurons
Question 5 — What does a Loss function do?
Measures how good the current ‘guess’ is
Question 6 — What does the optimizer do?
Generates a new and improved guess
Question 7 — What is Convergence?
The process of getting very close to the correct answer
Question 8 — What does model.fit do?
It trains the neural network to fit one set of values to another
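The answers above fit together as a loop: the loss scores the current guess, the optimizer produces an improved guess, and repeating this converges on the answer. Below is a minimal plain-Python sketch of that loop, not TensorFlow's implementation. The data (y = 2x - 1), the learning rate, and the names `w`, `b`, and `mse` are assumptions chosen for illustration.

```python
# Toy data following y = 2x - 1 (an assumed example relationship).
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

def mse(w, b):
    """Loss function: mean squared error of the current guess."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

w, b = 0.0, 0.0          # initial guess
learning_rate = 0.05     # assumed step size

for epoch in range(500):
    # Gradients of the MSE with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Optimizer step (plain gradient descent): produce an improved guess.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
```

After the loop, `w` and `b` sit very close to 2 and -1, which is what convergence means here.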
QUIZ
Question 1 — What’s the name of the dataset of Fashion images used in this week’s code?
Fashion MNIST
Question 2 — What do the above-mentioned images look like?
28×28 Greyscale
Question 3 — How many images are in the Fashion MNIST dataset?
70,000
Question 4 — Why are there 10 output neurons?
There are 10 different labels
Question 5 — What does Relu do?
It only returns x if x is greater than zero
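As a quick illustration of that answer, here is a plain-Python sketch of ReLU. The function name `relu` and the sample inputs are my own; TensorFlow ships its own implementation.

```python
def relu(x):
    """Return x when x is positive, otherwise 0."""
    return x if x > 0 else 0.0

# Negative inputs (and zero) are clamped to 0; positive inputs pass through.
activations = [relu(v) for v in [-2.0, -0.5, 0.0, 1.5, 3.0]]
```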
Question 6 — Why do you split data into training and test sets?
To test a network with previously unseen data
Question 7 — What method gets called when an epoch finishes?
on_epoch_end
Question 8 — What parameter do you set in your fit function to tell it to use callbacks?
callbacks=
QUIZ
Question 1 — What is a Convolution?
A technique to isolate features in images
Question 2 — What is a Pooling?
A technique to reduce the information in an image while maintaining features
Question 3 — How do Convolutions improve image recognition?
They isolate features in images
Question 4 — After passing a 3×3 filter over a 28×28 image, how big will the output be?
26×26
Question 5 — After max pooling a 26×26 image with a 2×2 filter, how big will the output be?
13×13
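The arithmetic behind the last two answers can be sketched in a few lines. This assumes a convolution with no padding and a stride of 1, and pooling whose stride equals the pool size (the Keras defaults, as far as I know). The helper names are hypothetical.

```python
def conv_output_size(size, kernel):
    """Side length after a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_output_size(size, pool):
    """Side length after pooling with a pool-sized stride (remainder dropped)."""
    return size // pool

after_conv = conv_output_size(28, 3)           # 3x3 filter over a 28x28 image
after_pool = pool_output_size(after_conv, 2)   # 2x2 max pooling
```

The filter cannot be centered on the border pixels, so the convolution trims one pixel from each edge; the pooling then halves each side.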
Question 6 — Applying Convolutions on top of our Deep neural network will make training:
It depends on many factors. It might make your training faster or slower, and a poorly designed Convolutional layer may even be less efficient than a plain DNN!
QUIZ
Question 1 — Using Image Generator, how do you label images?
It’s based on the directory the image is contained in
Question 2 — What method on the Image Generator is used to normalize the image?
rescale
Question 3 — How did we specify the training size for the images?
The target_size parameter on the training generator
Question 4 — When we specify the input_shape to be (300, 300, 3), what does that mean?
Every Image will be 300×300 pixels, with 3 bytes to define color
Question 5 — If your training data is close to 1.000 accuracy, but your validation data isn’t, what’s the risk here?
You’re overfitting on your training data
Question 6 — Convolutional Neural Networks are better for classifying images like horses and humans because:
All of the above (The distinguishable features may be in different parts of the frame, There’s a wide variety of horses, There’s a wide variety of humans)
Question 7 — After reducing the size of the images, the training results were different. Why?
We removed some convolutions to handle the smaller images
Course #2 — Convolutional Neural Networks in TensorFlow
QUIZ
Question 1 — What does flow_from_directory give you on the ImageGenerator?
All of the Above (ability to easily load images for training, ability to pick the size of training images, ability to automatically label images based on their directory name)
Question 2 — If my Image is sized 150×150, and I pass a 3×3 Convolution over it, what size is the resulting image?
148×148
Question 3 — If my data is sized 150×150, and I use Pooling of size 2×2, what size will the resulting image be?
75×75
Question 4 — If I want to view the history of my training, how can I access it?
Create a variable ‘history’ and assign it to the return of model.fit or model.fit_generator
Question 5 — What’s the name of the API that allows you to inspect the impact of convolutions on the images?
The model.layers API
Question 6 — When exploring the graphs, the loss levelled out at about .75 after 2 epochs, but the accuracy climbed close to 1.0 after 15 epochs. What’s the significance of this?
There was no point training after 2 epochs, as we overfit to the training data
Question 7 — Why is the validation accuracy a better indicator of model performance than training accuracy?
The validation accuracy is based on images that the model hasn’t been trained with, and thus a better indicator of how the model will perform with new images
Question 8 — Why is overfitting more likely to occur on smaller datasets?
Because there’s less likelihood of all possible features being encountered in the training process.
QUIZ
Question 1 — How do you use Image Augmentation in TensorFlow?
Using parameters to the ImageDataGenerator
Question 2 — If my training data only has people facing left, but I want to classify people facing right, how would I avoid overfitting?
Use the ‘horizontal_flip’ parameter
Question 3 — When training with augmentation, you noticed that the training is a little slower. Why?
Because the image processing takes cycles
Question 4 — What does the fill_mode parameter do?
It attempts to recreate lost information after a transformation like a shear
Question 5 — When using Image Augmentation with the ImageDataGenerator, what happens to your raw image data on-disk?
Nothing, all augmentation is done in-memory
Question 6 — How does Image Augmentation help solve overfitting?
It manipulates the training set to generate more scenarios for features in the images
Question 7 — When using Image Augmentation my training gets…
Slower
Question 8 — Using Image Augmentation effectively simulates having a larger data set for training.
True
QUIZ
Question 1 — If I put a dropout parameter of 0.2, how many nodes will I lose?
20% of them
Question 2 — Why is transfer learning useful?
Because I can use the features that were learned from large datasets that I may not have access to
Question 3 — How did you lock or freeze a layer from retraining?
layer.trainable = False
Question 4 — How do you change the number of classes the model can classify when using transfer learning? (i.e. the original model handled 1000 classes, but yours handles just 2)
When you add your DNN at the bottom of the network, you specify your output layer with the number of classes you want
Question 5 — Can you use Image Augmentation with Transfer Learning Models?
Yes, because you are adding new layers at the bottom of the network, and you can use image augmentation when training these
Question 6 — Why do dropouts help avoid overfitting?
Because neighbor neurons can have similar weights, and thus can skew the final training
Question 7 — What would be the symptom of a Dropout rate being set too high?
The network would lose specialization to the effect that it would be inefficient or ineffective at learning, driving accuracy down
Question 8 — Which is the correct line of code for adding Dropout of 20% of neurons using TensorFlow?
tf.keras.layers.Dropout(0.2)
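To make the 20% figure concrete, here is a plain-Python sketch of what a dropout layer does during training. It is an illustration only, not the Keras code. The function name `dropout`, the seed, and the input list are assumptions. Survivors are scaled by 1 / (1 - rate), which mirrors what Keras does so that the expected sum stays the same.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(42)        # fixed seed so the sketch is repeatable
inputs = [1.0] * 10_000
outputs = dropout(inputs, rate=0.2, rng=rng)

# Roughly 20% of the values end up zeroed; the exact count varies per run.
dropped_fraction = outputs.count(0.0) / len(outputs)
```

Note that "20% of them" is an average: each node is dropped independently, so the exact number lost changes from batch to batch.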
QUIZ
Question 1 — The diagram for traditional programming had Rules and Data In, but what came out?
Answers
Question 2 — Why does the DNN for Fashion MNIST have 10 output neurons?
The dataset has 10 classes
Question 3 — What is a Convolution?
A technique to extract features from an image
Question 4 — Applying Convolutions on top of a DNN will have what impact on training?
It depends on many factors. It might make your training faster or slower, and a poorly designed Convolutional layer may even be less efficient than a plain DNN!
Question 5 — What method on an ImageGenerator is used to normalize the image?
rescale
Question 6 — When using Image Augmentation with the ImageDataGenerator, what happens to your raw image data on-disk?
Nothing
Question 7 — Can you use Image augmentation with Transfer Learning?
Yes. It’s pre-trained layers that are frozen. So you can augment your images as you train the bottom layers of the DNN with them
Question 8 — When training for multiple classes what is the Class Mode for Image Augmentation?
class_mode='categorical'
Course #3 — Natural Language Processing in TensorFlow
QUIZ
Question 1 — What is the name of the object used to tokenize sentences?
Tokenizer
Question 2 — What is the name of the method used to tokenize a list of sentences?
fit_on_texts(sentences)
Question 3 — Once you have the corpus tokenized, what’s the method used to encode a list of sentences to use those tokens?
texts_to_sequences(sentences)
Question 4 — When initializing the tokenizer, how do you specify a token to use for unknown words?
oov_token=
Question 5 — If you don’t use a token for out of vocabulary words, what happens at encoding?
The word isn’t encoded, and is skipped in the sequence
Question 6 — If you have a number of sequences of different lengths, how do you ensure that they are understood when fed into a neural network?
Use the pad_sequences function from the tensorflow.keras.preprocessing.sequence namespace
Question 7 — If you have a number of sequences of different length, and call pad_sequences on them, what’s the default result?
They’ll get padded to the length of the longest sequence by adding zeros to the beginning of shorter ones
Question 8 — When padding sequences, if you want the padding to be at the end of the sequence, how do you do it?
Pass padding='post' to pad_sequences when calling it
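The tokenize, encode, and pad steps from this quiz can be imitated in plain Python. The functions below are simplified stand-ins that borrow the Keras names; they are not the Keras implementations (for example, they do no punctuation filtering). The sentences and the `<OOV>` token text are assumptions.

```python
from collections import Counter

def fit_on_texts(sentences, oov_token="<OOV>"):
    """Build a word index: OOV token gets 1, then words by descending frequency."""
    counts = Counter(word for s in sentences for word in s.lower().split())
    word_index = {oov_token: 1}
    for word, _ in counts.most_common():
        word_index[word] = len(word_index) + 1
    return word_index

def texts_to_sequences(sentences, word_index, oov_token="<OOV>"):
    """Encode each sentence, mapping unseen words to the OOV index."""
    oov = word_index[oov_token]
    return [[word_index.get(w, oov) for w in s.lower().split()] for s in sentences]

def pad_sequences(sequences, padding="pre"):
    """Zero-pad every sequence to the length of the longest one."""
    maxlen = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        zeros = [0] * (maxlen - len(seq))
        padded.append(zeros + seq if padding == "pre" else seq + zeros)
    return padded

word_index = fit_on_texts(["I love my dog", "I love my cat"])
sequences = texts_to_sequences(["I love my dog", "my dog loves my manatee"], word_index)
pre_padded = pad_sequences(sequences)                   # zeros at the start
post_padded = pad_sequences(sequences, padding="post")  # zeros at the end
```

The unseen words "loves" and "manatee" both map to index 1 because an OOV token was supplied; without one, they would simply be left out of the sequence.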
QUIZ
Question 1 — What is the name of the TensorFlow library containing common data that you can use to train and test neural networks?
TensorFlow Datasets
Question 2 — How many reviews are there in the IMDB dataset and how are they split?
50,000 records, 50/50 train/test split
Question 3 — How are the labels for the IMDB dataset encoded?
Reviews encoded as a number 0-1
Question 4 — What is the purpose of the embedding dimension?
It is the number of dimensions for the vector representing the word encoding
Question 5 — When tokenizing a corpus, what does the num_words=n parameter do?
It specifies the maximum number of words to be tokenized, and picks the most common ‘n’ words
Question 6 — To use word embeddings in TensorFlow, in a sequential layer, what is the name of the class?
tf.keras.layers.Embedding
Question 7 — IMDB Reviews are either positive or negative. What type of loss function should be used in this scenario?
Binary crossentropy
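For a feel of why this loss suits a two-class problem, here is a plain-Python sketch of the binary cross-entropy formula. The function name, the clipping constant `eps`, and the sample predictions are assumptions; Keras has its own implementation.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
coin_flip = binary_crossentropy([1, 0], [0.5, 0.5])
```

Confident correct predictions score a low loss, a coin flip scores ln 2, and confident wrong predictions are penalized most.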
Question 8 — When using IMDB Sub Words dataset, our results in classification were poor. Why?
Sequence becomes much more important when dealing with subwords, but we’re ignoring word positions
QUIZ
Question 1 — Why does sequence make a large difference when determining semantics of language?
Because the order in which words appear dictates their impact on the meaning of the sentence
Question 2 — How do Recurrent Neural Networks help you understand the impact of sequence on meaning?
They carry meaning from one cell to the next
Question 3 — How does an LSTM help understand meaning when words that qualify each other aren’t necessarily beside each other in a sentence?
Values from earlier words can be carried to later ones via a cell state
Question 4 — What keras layer type allows LSTMs to look forward and backward in a sentence?
Bidirectional
Question 5 — What’s the output shape of a bidirectional LSTM layer with 64 units?
(None, 128)
Question 6 — When stacking LSTMs, how do you instruct an LSTM to feed the next one in the sequence?
Ensure that return_sequences is set to True only on units that feed to another LSTM
Question 7 — If a sentence has 120 tokens in it, and a Conv1D with 128 filters with a kernel size of 5 is passed over it, what’s the output shape?
(None, 116, 128)
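Both shape answers in this quiz come down to simple arithmetic, sketched below. This assumes the Conv1D uses no padding and a stride of 1, and that the Bidirectional wrapper concatenates the forward and backward outputs and returns only the final step (the defaults, as far as I know). The helper names are hypothetical.

```python
def conv1d_output_shape(steps, filters, kernel_size):
    """Shape after a Conv1D with no padding and stride 1 (batch dimension is None)."""
    return (None, steps - kernel_size + 1, filters)

def bidirectional_output_shape(units):
    """Shape after a Bidirectional LSTM that concatenates both directions."""
    return (None, 2 * units)

conv_shape = conv1d_output_shape(steps=120, filters=128, kernel_size=5)
lstm_shape = bidirectional_output_shape(units=64)
```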
Question 8 — What’s the best way to avoid overfitting in NLP datasets?
None of the above (Use LSTMs, Use GRUs, Use Conv1D)
QUIZ
Question 1 — What is the name of the method used to tokenize a list of sentences?
fit_on_texts(sentences)
Question 2 — If a sentence has 120 tokens in it, and a Conv1D with 128 filters with a kernel size of 5 is passed over it, what’s the output shape?
(None, 116, 128)
Question 3 — What is the purpose of the embedding dimension?
It is the number of dimensions for the vector representing the word encoding
Question 4 — IMDB Reviews are either positive or negative. What type of loss function should be used in this scenario?
Binary crossentropy
Question 5 — If you have a number of sequences of different lengths, how do you ensure that they are understood when fed into a neural network?
Use the pad_sequences function from the tensorflow.keras.preprocessing.sequence namespace
Question 6 — When predicting words to generate poetry, the more words predicted the more likely it will end up gibberish. Why?
Because the probability that each word matches an existing phrase goes down the more words you create
Question 7 — What is a major drawback of word-based training for text generation instead of character-based generation?
Because there are far more words in a typical corpus than characters, it is much more memory intensive
Question 8 — How does an LSTM help understand meaning when words that qualify each other aren’t necessarily beside each other in a sentence?
Values from earlier words can be carried to later ones via a cell state
Course #4 — Sequences, Time Series, and Prediction
Metrics
QUIZ
Question 1 — What is an example of a Univariate time series?
Hour by hour temperature
Question 2 — What is an example of a Multivariate time series?
Hour by hour weather
Question 3 — What is imputed data?
A projection of unknown (usually past or missing) data
Question 4 — A sound wave is a good example of time series data
True
Question 5 — What is Seasonality?
A regular change in shape of the data
Question 6 — What is a trend?
An overall direction for the data, regardless of short-term ups and downs
Question 7 — In the context of time series, what is noise?
Unpredictable changes in time series data
Question 8 — What is autocorrelation?
Data that follows a predictable shape, even if the scale is different
Question 9 — What is a non-stationary time series?
One that has a disruptive event breaking trend and seasonality
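Trend, seasonality, and noise are easiest to see when you build a series out of them. Below is a small plain-Python sketch that does so. The slope, period, amplitude, noise level, and function names are all assumptions picked for illustration.

```python
import random

def trend(t, slope):
    """Overall direction: a straight line through time."""
    return slope * t

def seasonality(t, period, amplitude):
    """A pattern that repeats every `period` steps (here a simple ramp)."""
    return amplitude * (t % period) / period

rng = random.Random(0)   # fixed seed so the noise is repeatable
steps = range(365 * 2)   # two "years" of daily values
series = [
    trend(t, 0.05) + seasonality(t, 365, 10.0) + rng.gauss(0, 1.0)  # noise term last
    for t in steps
]

first_year_mean = sum(series[:365]) / 365
second_year_mean = sum(series[365:]) / 365
```

The seasonal term is identical on day 10 and day 375, the trend pushes the second year's average above the first, and the Gaussian term is the unpredictable part.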
QUIZ
Question 1 — What is a windowed dataset?
A fixed-size subset of a time series
Question 2 — What does drop_remainder=True do?
It ensures that all rows in the data window are the same length by cropping data
Question 3 — What’s the correct line of code to split an n column window into n-1 columns for features and 1 column for a label?
dataset = dataset.map(lambda window: (window[:-1], window[-1:]))
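The windowing, cropping, and splitting steps from the first three questions can be imitated with plain lists. This sketch is a stand-in for a tf.data pipeline, not the real thing. The function name `windowed_dataset`, the one-step shift, and the toy series are assumptions.

```python
def windowed_dataset(series, window_size, drop_remainder=True):
    """Slide a window of `window_size` values over `series`, one step at a time."""
    windows = [series[i:i + window_size] for i in range(len(series))]
    if drop_remainder:
        # Discard the short windows at the tail so every row has the same length.
        windows = [w for w in windows if len(w) == window_size]
    # Split each window into features (all but last) and a label (last value).
    return [(w[:-1], w[-1:]) for w in windows]

pairs = windowed_dataset(list(range(6)), window_size=4)
ragged = windowed_dataset(list(range(6)), window_size=4, drop_remainder=False)
```

With `drop_remainder=False` the last three windows are shorter than four values, which is exactly what the cropping avoids.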
Question 4 — What does MSE stand for?
Mean Squared Error
Question 5 — What does MAE stand for?
Mean Absolute Error
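Here is a plain-Python sketch of both metrics side by side. The function names and the sample forecast are made up for illustration.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

actual = [10.0, 10.0, 10.0, 10.0]
forecast = [11.0, 9.0, 10.0, 18.0]   # three small misses and one large miss of 8

mse_value = mse(actual, forecast)
mae_value = mae(actual, forecast)
```

The single large miss dominates the MSE because it is squared, while it contributes to the MAE only in proportion to its size.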
Question 6 — If time values are in time[], series values are in series[] and we want to split the series into training and validation at time 1000, what is the correct code?
time_train = time[:split_time]
x_train = series[:split_time]
time_valid = time[split_time:]
x_valid = series[split_time:]
Question 7 — If you want to inspect the learned parameters in a layer after training, what’s a good technique to use?
Assign a variable to the layer and add it to the model using that variable. Inspect its properties after training
Question 8 — How do you set the learning rate of the SGD optimizer?
Use the lr property (named learning_rate in current Keras releases)
Question 9 — If you want to amend the learning rate of the optimizer on the fly, after each epoch, what do you do?
Use a LearningRateScheduler object in the callbacks namespace and assign that to the callback
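A LearningRateScheduler is driven by a function that maps an epoch number to a learning rate. Below is a sketch of one such function in plain Python. The starting rate, the tenfold growth every 20 epochs, and the constant names are assumptions for illustration, not values from the quiz.

```python
INITIAL_LR = 1e-8          # assumed starting learning rate
EPOCHS_PER_DECADE = 20     # assumed: rate grows tenfold every 20 epochs

def schedule(epoch, current_lr=None):
    """Return the learning rate for `epoch`; `current_lr` is accepted but unused."""
    return INITIAL_LR * 10 ** (epoch / EPOCHS_PER_DECADE)

# Preview the sweep for a 100-epoch run.
learning_rates = [schedule(epoch) for epoch in range(100)]
```

Sweeping the rate upward like this and plotting loss against learning rate is one way to pick a sensible fixed rate afterwards.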
QUIZ
Question 1 — If X is the standard notation for the input to an RNN, what are the standard notations for the outputs?
Y(hat) and H
Question 2 — What is a sequence to vector if an RNN has 30 cells numbered 0 to 29?
The Y(hat) for the last cell
Question 3 — What does a Lambda layer in a neural network do?
Allows you to execute arbitrary code while training
Question 4 — What does the axis parameter of tf.expand_dims do?
Defines the dimension index at which you will expand the shape of the tensor
Question 5 — A new loss function was introduced in this module, named after a famous statistician. What is it called?
Huber loss
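For reference, here is the Huber loss for a single error value, sketched in plain Python. The function name and the threshold `delta=1.0` are assumptions for this sketch.

```python
def huber(error, delta=1.0):
    """Quadratic for small errors, linear for large ones."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error ** 2
    return delta * (abs_error - 0.5 * delta)

small = huber(0.5)    # inside the threshold: behaves like squared error
large = huber(3.0)    # outside the threshold: grows linearly
```

Because large errors grow linearly instead of quadratically, outliers and noisy points pull on the model less than they would under MSE.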
Question 6 — What’s the primary difference between a simple RNN and an LSTM?
In addition to the H output, LSTMs have a cell state that runs across all cells
Question 7 — If you want to clear out all temporary variables that tensorflow might have from previous sessions, what code do you run?
tf.keras.backend.clear_session()
Question 8 — What happens if you define a neural network with these two layers?
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
tf.keras.layers.Dense(1),
Your model will fail because you need return_sequences=True on the first LSTM layer
QUIZ
Question 1 — How do you add a 1 dimensional convolution to your model for predicting time series data?
Use a Conv1D layer type
Question 2 — What’s the input shape for a univariate time series to a Conv1D?
[None, 1]
Question 3 — You used a sunspots dataset that was stored in CSV. What’s the name of the Python library used to read CSVs?
csv
Question 4 — If your CSV file has a header that you don’t want to read into your dataset, what do you execute before iterating through the file using a ‘reader’ object?
next(reader)
Question 5 — When you read a row from a reader and want to cast column 2 to another data type, for example, a float, what’s the correct syntax?
float(row[2])
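The last three answers combine into a short reading loop. This sketch uses an in-memory string in place of a real file, and the column names and values are made up rather than taken from the sunspots dataset.

```python
import csv
import io

# Stand-in for an open CSV file; the header and rows are illustrative only.
csv_text = "index,date,value\n0,2000-01-31,1.5\n1,2000-02-29,2.5\n"

time_steps = []
values = []
reader = csv.reader(io.StringIO(csv_text))
next(reader)                           # skip the header row
for row in reader:
    time_steps.append(int(row[0]))
    values.append(float(row[2]))       # cast column 2 from string to float
```

With a real file you would pass the open file object to `csv.reader` instead of the `io.StringIO` wrapper.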
Question 6 — What was the sunspot seasonality?
11 or 22 years depending on who you ask
Question 7 — After studying this course, what neural network type do you think is best for predicting time series like our sunspots dataset?
A combination of all of the above (DNN, RNN / LSTM, Convolutions)
Question 8 — Why is MAE a good analytic for measuring accuracy of predictions for time series?
It doesn’t heavily punish larger errors like square errors do