update homepage
Yuwei Yan committed Jan 7, 2025
1 parent 0417eb3 commit 526c27e
Showing 1 changed file with 29 additions and 21 deletions.
50 changes: 29 additions & 21 deletions docs/pages/modeling-track.html
@@ -38,6 +38,14 @@
border-left: 4px solid #007bff;
}

.section h4 {
color: #2c3e50;
font-size: 1.2em;
margin: 20px 0 10px;
padding-left: 15px;
border-left: 4px solid #007bff;
}

/* Link styles */
.section a {
color: #007bff;
@@ -196,50 +204,50 @@ <h2>Evaluation</h2>
Once the Top 20 teams have been selected for the Final Phase, the final results will be computed as a weighted combination of scores on simulation data (40%) and real data (60%).
The evaluation criteria are as follows:
</p>

<h3>Preference Estimation</h3>
<ul>
<li>The preference estimation score is derived from the star rating accuracy described below.</li>
<li><b>Metric:</b> 1 - MAE of star ratings; higher values indicate closer agreement with actual preferences.</li>
</ul>

<h4>Star Rating Accuracy</h4>
<ul>
<li><b>Metric:</b> Mean Absolute Error (MAE)</li>
<li><b>Description:</b> The predicted star ratings will be compared to ground truth values, normalized to the range [0,1].</li>
<li><b>Formula:</b></li>
</ul>
<p>
\[
MAE = \frac{1}{N} \sum_{i=1}^{N} \left| \hat{s}_{ni} - s_{ni} \right|
\]
where \(N\) is the total number of reviews, and \(\hat{s}_{ni}\) and \(s_{ni}\) are the normalized predicted and ground-truth star ratings, respectively. A similar evaluation method has been used in [3].
</p>
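<p>
A minimal sketch of this computation in Python, assuming 1-5 star ratings normalized to [0,1] (the exact data format is not specified here):
</p>
<pre>
import numpy as np

def star_rating_mae(pred_stars, true_stars, s_min=1.0, s_max=5.0):
    """Normalize star ratings to [0, 1] and compute the MAE."""
    pred = (np.asarray(pred_stars, dtype=float) - s_min) / (s_max - s_min)
    true = (np.asarray(true_stars, dtype=float) - s_min) / (s_max - s_min)
    return float(np.mean(np.abs(pred - true)))

# Example: the preference estimation score above is 1 - MAE.
# star_rating_mae([4, 3, 5], [5, 3, 4])  -> 0.1667, so the score is 0.8333
</pre>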

<h3>Review Generation</h3>
<ul>
<li>The review generation score is derived from the three review error metrics described below.</li>
<li><b>Metric:</b> 1 - (Emotional Tone Error * 0.25 + Sentiment Attitude Error * 0.25 + Topic Relevance Error * 0.5); higher values indicate closer agreement with actual reviews.</li>
</ul>
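<p>
A minimal sketch of the weighted combination, assuming each error term is already normalized to [0,1]:
</p>
<pre>
def review_generation_score(emotion_err, sentiment_err, topic_err):
    """Combine the three error terms with the weights stated above."""
    return 1.0 - (0.25 * emotion_err + 0.25 * sentiment_err + 0.5 * topic_err)

# Example: review_generation_score(0.1, 0.2, 0.1) -> 0.875
</pre>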

<h4>Emotional Tone Error</h4>
<ul>
<li>A vector of emotion scores for the top five emotions in the review text is calculated using a predefined emotion classifier model [1], with each dimension normalized to the range [0,1].</li>
<li><b>Metric:</b> Mean Absolute Error (MAE) of normalized emotion scores, reflecting the deviation from the actual emotions.</li>
</ul>
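<p>
The classifier in [1] is not reproduced here; as an illustrative stand-in, the sketch below uses a Hugging Face text-classification pipeline (a hypothetical model choice) and compares the two reviews on the top five emotions of the ground-truth review. How the five emotions are aligned between the two texts is an assumption.
</p>
<pre>
from transformers import pipeline  # illustrative stand-in for the classifier in [1]

# Hypothetical model choice; the competition's actual classifier may differ.
classifier = pipeline("text-classification",
                      model="j-hartmann/emotion-english-distilroberta-base",
                      top_k=None)

def emotion_scores(text):
    """Map each emotion label to its score in [0, 1]."""
    return {d["label"]: d["score"] for d in classifier([text])[0]}

def emotional_tone_error(pred_text, true_text, top_n=5):
    """MAE over the top_n emotions of the ground-truth review (an assumption)."""
    pred, true = emotion_scores(pred_text), emotion_scores(true_text)
    labels = sorted(true, key=true.get, reverse=True)[:top_n]
    return sum(abs(pred.get(label, 0.0) - true[label]) for label in labels) / top_n
</pre>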

<h4>Sentiment Attitude Error</h4>
<ul>
<li>The sentiment attitude of the review text is analyzed using <i>nltk.sentiment.SentimentIntensityAnalyzer()</i>, with the resulting value normalized to the range [0,1].</li>
<li><b>Metric:</b> Mean Absolute Error (MAE) of normalized sentiment scores, indicating the deviation from actual sentiment attitude.</li>
</ul>
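<p>
Since <i>nltk.sentiment.SentimentIntensityAnalyzer()</i> is named explicitly, a sketch is straightforward; rescaling VADER's [-1, 1] compound score to [0, 1] is an assumption, as only the target range is stated:
</p>
<pre>
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # one-time download required by the analyzer
sia = SentimentIntensityAnalyzer()

def sentiment_attitude(text):
    """VADER's compound score lies in [-1, 1]; rescale it to [0, 1]."""
    return (sia.polarity_scores(text)["compound"] + 1.0) / 2.0

def sentiment_attitude_error(pred_text, true_text):
    """Absolute deviation of the normalized sentiment scores."""
    return abs(sentiment_attitude(pred_text) - sentiment_attitude(true_text))
</pre>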

<h4>Topic Relevance Error</h4>
<ul>
<li>An embedding vector for the review text is generated using a predefined embedding model [2].</li>
<li><b>Metric:</b> Cosine similarity between text embeddings, measuring alignment with the real topics.</li>
</ul>
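<p>
The embedding model in [2] is not reproduced here; as an illustrative stand-in, the sketch below uses sentence-transformers (a hypothetical model choice) and assumes the error term is 1 - cosine similarity, so higher similarity yields lower error:
</p>
<pre>
from sentence_transformers import SentenceTransformer, util  # stand-in for [2]

# Hypothetical model choice; the competition's actual embedding model may differ.
model = SentenceTransformer("all-MiniLM-L6-v2")

def topic_relevance_error(pred_text, true_text):
    """Assumed definition: 1 - cosine similarity of the two text embeddings."""
    emb = model.encode([pred_text, true_text], convert_to_tensor=True)
    return 1.0 - util.cos_sim(emb[0], emb[1]).item()
</pre>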

<h3>Overall Quality</h3>
<ul>
<li>The overall quality is calculated from the preference estimation and review generation scores.</li>
