Update DoubleML_and_Feature_Engineering_with_BERT.ipynb
chansen776 committed Jun 26, 2024
1 parent 5a03f52 commit a05d4c4
Showing 1 changed file with 6 additions and 5 deletions.
11 changes: 6 additions & 5 deletions PM5/DoubleML_and_Feature_Engineering_with_BERT.ipynb
@@ -27,7 +27,7 @@
"\n",
"BERT is a great example of a paradigm called *transfer learning*, which has proved very effective in recent years. In the first step, a network is trained on an unsupervised task using massive amounts of data. In the case of BERT, it was trained to predict missing words and to detect when pairs of sentences are presented in reversed order using all of Wikipedia. This was initially done by Google, using intense computational resources.\n",
"\n",
-"Once this network has been trained, it is then used to perform many other supervised tasks using only limited data and computational resources: for example, sentiment classification in tweets or quesiton answering. The network is re-trained to perform these other tasks in such a way that only the final, output parts of the network are allowed to adjust by very much, so that most of the \"information'' originally learned the network is preserved. This process is called *fine tuning*."
+"Once this network has been trained, it is then used to perform many other supervised tasks using only limited data and computational resources: for example, sentiment classification in tweets or question answering. The network is re-trained to perform these other tasks in such a way that only the final, output parts of the network are allowed to adjust by very much, so that most of the \"information\" originally learned by the network is preserved. This process is called *fine tuning*."
]
},
{
@@ -38,7 +38,7 @@
"source": [
"##Getting to know BERT\n",
"\n",
-"BERT, and many of its variants, are made avialable to the public by the open source [Huggingface Transformers](https://huggingface.co/transformers/) project. This is an amazing resource, giving researchers and practitioners easy-to-use access to this technology.\n",
+"BERT, and many of its variants, are made available to the public by the open source [Huggingface Transformers](https://huggingface.co/transformers/) project. This is an amazing resource, giving researchers and practitioners easy-to-use access to this technology.\n",
"\n",
"In order to use BERT for modeling, we simply need to download the pre-trained neural network and fine tune it on our dataset, which is illustrated below."
]
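The download-and-fine-tune workflow described in the cell above can be sketched with the Transformers API. This is a minimal sketch, not the notebook's actual code: it assumes the `transformers` and `tensorflow` packages are installed and uses the standard `bert-base-uncased` checkpoint, which may differ from the checkpoint the notebook ultimately selects.

```python
# Minimal sketch: download a pre-trained BERT and run one sentence through it.
# Assumes `transformers` and `tensorflow` are installed; the checkpoint name
# `bert-base-uncased` is an assumption, not taken from the notebook.
from transformers import AutoTokenizer, TFAutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = TFAutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("A sample product description.", return_tensors="tf")
outputs = bert(**inputs)

# Each token receives a contextual embedding; bert-base uses hidden size 768.
print(outputs.last_hidden_state.shape[-1])  # 768
```

Fine-tuning then amounts to stacking a small task-specific head on top of these embeddings and training end-to-end on the labeled data, typically with a low learning rate so the pre-trained weights move only slightly.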
@@ -126,7 +126,7 @@
},
"outputs": [],
"source": [
-"# Load TensorFlow, and ensure GPU is pressent\n",
+"# Load TensorFlow, and ensure GPU is present\n",
"# The GPU will massively speed up neural network training\n",
"%tensorflow_version 2.x\n",
"import tensorflow as tf\n",
@@ -287,6 +287,7 @@
"# Clean numeric data fields (remove all non-digit characters and parse as a numeric value)\n",
"data['number_of_reviews'] = pd.to_numeric(data\n",
" .number_of_reviews\n",
+"                            .str.replace(',', '') # Remove commas\n",
" .str.replace(r\"\\D+\",''))\n",
"data['price'] = (data\n",
" .price\n",
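One caveat about the cleaning code in this hunk: in recent pandas versions `Series.str.replace` no longer treats a multi-character pattern as a regular expression by default, so the `r"\D+"` call needs an explicit `regex=True`. A self-contained sketch of the same cleaning step on toy data (the column values here are illustrative, not from the dataset):

```python
import pandas as pd

# Toy rows standing in for scraped product data (illustrative values only)
df = pd.DataFrame({"number_of_reviews": ["1,234", "56", "789 reviews"],
                   "price": ["$19.99", "$5", "12.50"]})

# Strip every non-digit character, then parse as a number.
# regex=True is required in modern pandas for the pattern to act as a regex.
df["number_of_reviews"] = pd.to_numeric(
    df["number_of_reviews"].str.replace(r"\D+", "", regex=True))

# For price, keep the decimal point while removing currency symbols and commas.
df["price"] = pd.to_numeric(
    df["price"].str.replace(r"[^0-9.]+", "", regex=True))

print(df["number_of_reviews"].tolist())  # [1234, 56, 789]
print(df["price"].tolist())              # [19.99, 5.0, 12.5]
```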
@@ -794,7 +795,7 @@
"id": "bafy9ftcoBed"
},
"source": [
-"Now, let's go one step further and construct a DML estimator of the average price elasticity. In particular, we will model market share $q_i$ as\n",
+"Now, let's go one step further and construct a DML estimator of the price elasticity in a partially linear model. In particular, we will model market share $q_i$ as\n",
"$$\\ln q_i = \\alpha + \\beta \\ln p_i + \\psi(d_i) + \\epsilon_i,$$ where $d_i$ denotes the description of product $i$ and $\\psi$ is the composition of text embedding and a linear layer."
]
},
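The partially linear model above is typically estimated with the residual-on-residual (Robinson-style DML) recipe: predict both $\ln q_i$ and $\ln p_i$ from the controls with cross-fitting, then regress the residuals on each other. A minimal sketch on synthetic data, assuming nothing about the notebook's actual implementation: random-forest nuisances and a simulated feature matrix stand in for the BERT embedding $\psi(d_i)$.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 5))               # stand-in for text features psi(d_i)
log_p = x[:, 0] + rng.normal(size=n)      # price depends on the features
log_q = 1.0 - 0.5 * log_p + x[:, 0] + rng.normal(size=n)  # true beta = -0.5

# Cross-fitted (out-of-fold) nuisance predictions, as DML requires
q_hat = cross_val_predict(
    RandomForestRegressor(n_estimators=100, random_state=0), x, log_q, cv=5)
p_hat = cross_val_predict(
    RandomForestRegressor(n_estimators=100, random_state=0), x, log_p, cv=5)

# Residual-on-residual regression gives the elasticity estimate
res_q, res_p = log_q - q_hat, log_p - p_hat
beta_hat = np.sum(res_p * res_q) / np.sum(res_p ** 2)
print(beta_hat)  # should be close to the true value of -0.5
```

In the notebook, the role of the random forest is played by the fine-tuned BERT network, but the orthogonalization logic is the same.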
@@ -944,7 +945,7 @@
"source": [
"# Heterogeneous Elasticities within Major Product Categories\n",
"\n",
-"We now look at the major product categories that have many products and we investigate whether the \"within group\" price elasticities"
+"We now look at the major product categories that have many products, and we investigate whether the \"within group\" price elasticities vary."
],
"metadata": {
"id": "VCqeRTB_BNEH"
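The within-group idea can be illustrated on toy data: fit a separate log-log slope per category and compare. This is a hedged sketch only; the category names and elasticity values are made up, and a plain OLS slope stands in for the notebook's DML estimator.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
cats = np.repeat(["beauty", "grocery", "pet"], 300)   # hypothetical categories
true_beta = {"beauty": -0.4, "grocery": -1.2, "pet": -0.8}
log_p = rng.normal(size=900)
log_q = (np.array([true_beta[c] for c in cats]) * log_p
         + rng.normal(scale=0.1, size=900))
df = pd.DataFrame({"category": cats, "log_p": log_p, "log_q": log_q})

# OLS slope of log_q on log_p within each category
def slope(g):
    x = g["log_p"] - g["log_p"].mean()
    y = g["log_q"] - g["log_q"].mean()
    return (x * y).sum() / (x ** 2).sum()

within = df.groupby("category")[["log_p", "log_q"]].apply(slope)
print(within)  # recovers roughly -0.4, -1.2, -0.8 per group
```

If the recovered slopes differ meaningfully across groups, that is evidence of heterogeneous within-group elasticities, which is the question the notebook section poses.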