Update metrics (#29)
* reorg / cleanup csv writing

* add new metrics and output

* topic_posted_at and topic_created_at

* add 0 for no replies

* add description for summary

* add better descriptions

* Pr to updatemetrics   add summary by module (#30)

* add function to get module and items

* create unique functions for re-usability

* create data structure for modules

* create summary by module fn and remove old code

* add variable and explain in readme for inclusion of summary by module

* update output names and include info in readme

* REFACTOR - begin refactor to util fns

* REFACTOR - use functions to clean wordcount string

* REFACTOR - remove comments

* REFACTOR - extract postSummary to function in util

* resolve spacing issue

* fix numbering

* mention all files created

* create warning and error helpers

* add comments and remove unnecessary exports

* remove trailing spaces

* move comment to appropriate place

* use more meaningful name

* resolve header comments

* fix async and await of apis so not sequential

* dont need flat

* create conditional promise

* rm white space

* remove sort comparison

* address refactor comments

* update timestamps and associated functions

* return null if not posted_at instead of 1969 date
alisonmyers authored Sep 17, 2024
1 parent d41b23b commit 38317b7
Showing 9 changed files with 2,368 additions and 820 deletions.
48 changes: 45 additions & 3 deletions README.md
@@ -3,13 +3,18 @@
> 💡 If you are teaching at the University of British Columbia, you may also be interested in the tool `Threadz` which provides visualizations and data from your Canvas discussion forums through a user interface in Canvas. You can learn more about the tool and how to request access in your course from the [LTHub Instructor Guide](https://lthub.ubc.ca/guides/threadz-instructor-guide/). `Threadz` was developed by Eastern Washington University.
# Canvas Discussion

### Data
> `{course_id}-discussion.csv`
This project pulls discussion data for the specified Canvas course(s) via the Canvas API and exports the results as CSV. The columns exported are:
* 'topic_id',
* 'topic_title',
* 'topic_message',
* 'topic_author_id',
* 'topic_author_name',
* 'topic_timestamp',
* 'topic_created_at',
* 'topic_posted_at',
* 'post_author_id',
* 'post_author_name',
* 'post_id',
@@ -18,7 +23,42 @@ This project pulls data via the Canvas API the discussions for the specified Can
* 'post_likes',
* 'post_timestamp'

Where a `topic` corresponds to a `discussion_topic` and `post` refers to replies to the `discussion_topic`. If a `discussion_topic` has no posts then you will see the `topic_` columns filled with no corresponding `post_` data. A `post` may have a `post_parent_id` if it is part of a threaded response.
Where a `topic` corresponds to a `discussion_topic` and `post` refers to all replies to the `discussion_topic`. If a `discussion_topic` has no posts then you will see the `topic_` columns filled with no corresponding `post_` data. A `post` may have a `post_parent_id` if it is part of a threaded response.

### Summary Data
> `{course_id}-discussion-summary.csv`
We have calculated summary metrics for each topic. The csv with the summary information includes the following columns:
* 'topic_id',
* 'topic_title',
* 'topic_author_id',
* 'topic_author_name',
* 'topic_created_at',
* 'topic_posted_at',
* 'number_of_posts': the total number of posts and replies in the topic
* 'median_posts_word_count': the median word count for all posts and replies to the topic
* 'average_time_to_post_hours': the average time to post or reply from the topic created_at date
* 'first_reply_timestamp': the timestamp of the first post
* 'average_time_to_post_from_first_reply_hours': the average time to post or reply from the first post (for cases where all discussions are released at once, this may be a more meaningful metric of time to reply)
* 'average_posts_per_author': the average posts per author (does not include enrollments with no posts)

Where a `post` is a response to a topic, and a `reply` is a reply to the post.
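
The metric definitions above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not the project's implementation: `wordCount`, `median`, `averageHoursFrom`, and `summarizeTopic` are hypothetical names, posts are assumed to be a flat array carrying `postAuthorId`, `postMessage`, and a `Date` in `postTimestamp`, and the first post is assumed to count (as zero hours) in the from-first-reply average.

```javascript
// Hypothetical helpers for illustration; not the functions in this repository's util module.
const HOUR_MS = 1000 * 60 * 60

// Count whitespace-separated words after replacing HTML tags with spaces.
const wordCount = message =>
  (message || '').replace(/<[^>]*>/g, ' ').split(/\s+/).filter(Boolean).length

// Median of a numeric array; null when there are no values.
const median = values => {
  if (values.length === 0) return null
  const sorted = [...values].sort((a, b) => a - b)
  const mid = Math.floor(sorted.length / 2)
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid]
}

// Average number of hours between a reference date and each post timestamp.
const averageHoursFrom = (referenceDate, posts) => {
  if (!referenceDate || posts.length === 0) return null
  const totalMs = posts.reduce((sum, p) => sum + (p.postTimestamp - referenceDate), 0)
  return totalMs / posts.length / HOUR_MS
}

// Build one summary row for a topic from its creation date and flat list of posts.
const summarizeTopic = (topicCreatedAt, posts) => {
  const authors = new Set(posts.map(p => p.postAuthorId))
  const firstReply = posts.length > 0
    ? new Date(Math.min(...posts.map(p => p.postTimestamp.getTime())))
    : null
  return {
    number_of_posts: posts.length,
    median_posts_word_count: median(posts.map(p => wordCount(p.postMessage))),
    average_time_to_post_hours: averageHoursFrom(topicCreatedAt, posts),
    first_reply_timestamp: firstReply,
    average_time_to_post_from_first_reply_hours: averageHoursFrom(firstReply, posts),
    average_posts_per_author: authors.size > 0 ? posts.length / authors.size : 0
  }
}
```

Dividing by the number of distinct authors who posted, rather than by enrollment, matches the note that enrollments with no posts are excluded.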

![alt text](image-1.png)

> `{course_id}-module-discussion-summary.csv`
We have calculated summary metrics at the level of `module` where there are multiple discussion topics. This is optional (see the `.env` setup under Getting Started). The csv with the summary information includes the following columns:
* 'module_id',
* 'module_name',
* 'module_unlock_at': the module's unlock date; assuming the course uses an unlock_at date, this is used to calculate 'average_time_to_post_hours'
* 'number_of_posts': the total number of posts and replies in the module
* 'median_posts_word_count': the median word count for all posts and replies to the module topics
* 'average_time_to_post_hours': the average time to post or reply from the module_unlock_at date
* 'first_reply_timestamp': the timestamp of the first post
* 'average_time_to_post_from_first_reply_hours': the average time to post or reply from the first post (for cases where all discussions are released at once, this may be a more meaningful metric of time to reply)
* 'average_posts_per_author': the average posts per author (does not include enrollments with no posts)


## Getting Started
These instructions will get you a copy of the project up and running on your local machine for use with your own API tokens and Canvas domains.
@@ -39,15 +79,17 @@ These instructions will get you a copy of the project up and running on your loc
1. Create a `.env` file.
1. Add the following: `CANVAS_API_TOKEN={YOUR API TOKEN}`, `CANVAS_API_DOMAIN={YOUR API DOMAIN}`, `COURSE_IDS={YOUR COURSE ID(s)}`.
> - At UBC the `CANVAS_API_DOMAIN` is `https://ubc.instructure.com/api/v1`
> - At another institution it might be something like `https://{school}.instructure.com/api/v1`
1. Add `INCLUDE_MODULE_SUMMARY=true` (or `INCLUDE_MODULE_SUMMARY=false`) to indicate whether you would like to include a summary grouped by module. If this is not in the .env it will default to false and no module summary will be created.

Your .env file should look like
```
CANVAS_API_TOKEN=22322...
CANVAS_API_DOMAIN=https://ubc.instructure.com/api/v1
COURSE_IDS=1111,1112
INCLUDE_MODULE_SUMMARY=false
```
1. Run the script. `npm start`.
1. A `{course_id}-discussion.csv` file should be generated with discussion data in the output folder for each provided course_id.
1. A `{course_id}-discussion.csv` and a `{course_id}-discussion-summary.csv` file should be generated with discussion data in the output folder for each provided course_id. If you have set `INCLUDE_MODULE_SUMMARY` to `true` then you will also see a file `{course_id}-module-discussion-summary.csv`.
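
To make the relationship between the `.env` values and the generated files concrete, here is a small sketch. `parseSettings` and `outputFiles` are hypothetical helpers written for this illustration and do not exist in the repository; the sketch also drops empty course ids, which the script itself may not do.

```javascript
// Hypothetical helper: read the two settings from an env-like object.
// INCLUDE_MODULE_SUMMARY counts as enabled only when it is exactly the string 'true'.
const parseSettings = env => ({
  courseIds: (env.COURSE_IDS || '')
    .split(',')
    .map(id => id.trim())
    .filter(Boolean), // assumption: ignore empty entries such as a trailing comma
  includeModuleSummary: env.INCLUDE_MODULE_SUMMARY === 'true'
})

// Hypothetical helper: list the CSV file names expected for each course.
const outputFiles = ({ courseIds, includeModuleSummary }) =>
  courseIds.flatMap(id => [
    `${id}-discussion.csv`,
    `${id}-discussion-summary.csv`,
    ...(includeModuleSummary ? [`${id}-module-discussion-summary.csv`] : [])
  ])
```

For example, `outputFiles(parseSettings(process.env))` would list two files per course when `INCLUDE_MODULE_SUMMARY` is missing or `false`, and three per course when it is `true`.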
## Authors
Binary file added image-1.png
158 changes: 117 additions & 41 deletions index.js
@@ -1,13 +1,32 @@
const capi = require('node-canvas-api')
const { flatten } = require('./util')
const writeToCSV = require('./writeToCSV')
require('dotenv').config();
const writeSummaryToCSV = require('./writeSummaryToCSV')
const writeSummaryByModuleToCSV = require('./writeSummaryByModuleToCSV')
require('dotenv').config()

// Check for COURSE_IDS in environment variables
if (!process.env.COURSE_IDS) {
console.error('Error: COURSE_IDS environment variable is not defined.');
process.exit(1); // Exit the script with a non-zero status
const envVariableWarning = (msg) => {
  console.info(msg)
}
const envVariableError = (msg) => {
  console.error(msg)
  process.exit(1)
}
// A missing INCLUDE_MODULE_SUMMARY only logs a notice; any other missing variable stops the script
const checkEnvVariable = (varName, errMsg) => {
  if (!process.env[varName]) {
    if (varName === 'INCLUDE_MODULE_SUMMARY') {
      envVariableWarning(errMsg)
    } else {
      envVariableError(`Error: ${errMsg} See README for an example .env`)
    }
  }
}

checkEnvVariable('COURSE_IDS', 'COURSE_IDS environment variable is not defined.')
checkEnvVariable('INCLUDE_MODULE_SUMMARY', 'INCLUDE_MODULE_SUMMARY environment variable is not defined. Define and set to `true` to include summary at module.')
checkEnvVariable('CANVAS_API_TOKEN', 'CANVAS_API_TOKEN environment variable is not defined. You need a token to run this script.')
checkEnvVariable('CANVAS_API_DOMAIN', 'CANVAS_API_DOMAIN environment variable is not defined.')

const getDiscussionTopicIds = courseId => capi.getDiscussionTopics(courseId)
.then(discussions => discussions.map(x => x.id))
@@ -27,54 +46,111 @@ const getNestedReplies = (replyObj, participants, topicId) => {
postAuthorName: authorName,
postMessage: replyObj.message,
postLikes: replyObj.rating_sum || 0,
postTimestamp: replyObj.created_at,
postTimestamp: new Date(replyObj.created_at),
postParentId: replyObj.parent_id || '',
postId: replyObj.id
}, ...replies]
}

const getDiscussionsAndTopics = async (courseId, topicIds) => {
  const fetchDetails = topicId => Promise.all([
    capi.getFullDiscussion(courseId, topicId),
    capi.getDiscussionTopic(courseId, topicId),
  ])

  const discussionsAndTopics = await Promise.all(
    topicIds.map(async topicId => {
      const [discussion, topic] = await fetchDetails(topicId)
      return { discussion, topic }
    })
  )

  return discussionsAndTopics
}

const processDiscussionTopic = ({ discussion, topic }) => {
  const topicId = topic.id
  const topicTitle = topic.title
  const topicMessage = topic.message
  const author = topic.author
  const topicCreatedAt = topic.created_at ? new Date(topic.created_at) : null
  const topicPostedAt = topic.posted_at ? new Date(topic.posted_at) : null
  const participants = discussion.participants
  const replies = discussion.view.length > 0
    ? discussion.view
      .filter(x => !x.deleted)
      .map(reply => getNestedReplies(reply, participants, topicId))
    : []

  return {
    topicId,
    topicTitle,
    topicMessage,
    topicAuthorId: author.id || '', // the topic author id can be null
    topicAuthorName: author.display_name || '', // the topic author can be null
    topicCreatedAt,
    topicPostedAt,
    replies
  }
}

const getDiscussions = async courseId => {
const discussionTopicIds = await getDiscussionTopicIds(courseId)
const discussionAndTopic = await Promise.all(
discussionTopicIds
.map(topicId => Promise.all([
capi.getFullDiscussion(courseId, topicId),
capi.getDiscussionTopic(courseId, topicId)
]))
)
return discussionAndTopic.map(([discussion, topic]) => {
const topicId = topic.id
const topicTitle = topic.title
const topicMessage = topic.message
const author = topic.author
const topicCreatedAt = topic.created_at
const participants = discussion.participants
const replies = discussion.view.length > 0
? discussion.view
.filter(x => !x.deleted)
.map(reply => getNestedReplies(reply, participants, topicId))
: []
const discussionsAndTopics = await getDiscussionsAndTopics(courseId, discussionTopicIds)

return discussionsAndTopics.map(processDiscussionTopic)
}


const getPublishedModuleDiscussions = async courseId => {

const modules = await capi.getModules(courseId)

const modulesWithDiscussionItems = await Promise.all(modules.map(async module => {
const items = await capi.getModuleItems(courseId, module.id)
const discussionItems = items.filter(item => item.type === "Discussion" && item.published)

const discussionsAndTopics = await getDiscussionsAndTopics(courseId, discussionItems.map(item => item.content_id))
const processedDiscussions = discussionsAndTopics.map(processDiscussionTopic)

const discussionItemWithDiscussionData = discussionItems.map(discussionItem => {
const discussionAndReplies = processedDiscussions.find(d => d.topicId === discussionItem.content_id)
return {
...discussionItem,
discussionAndReplies
}
})

return {
topicId,
topicTitle,
topicMessage,
topicAuthorId: author.id || '',
topicAuthorName: author.display_name || '',
topicCreatedAt,
replies
...module,
discussionItems: discussionItemWithDiscussionData
}
})
}))

return modulesWithDiscussionItems

}

const courseIds = process.env.COURSE_IDS.split(',').map(id => id.trim());
const courseIds = process.env.COURSE_IDS.split(',').map(id => id.trim())
const returnSummaryByModule = process.env.INCLUDE_MODULE_SUMMARY ? process.env.INCLUDE_MODULE_SUMMARY === 'true' : false

Promise.all(
courseIds.map(courseId =>
getDiscussions(courseId)
.then(discussions => writeToCSV(courseId, discussions))
)
courseIds.map(courseId => {
const basePromise = getDiscussions(courseId).then(discussions =>
Promise.all([
writeToCSV(courseId, discussions), // Writes detailed discussion data to CSV
writeSummaryToCSV(courseId, discussions) // Writes summary of discussion data to CSV
])
)

const additionalPromise = returnSummaryByModule
? getPublishedModuleDiscussions(courseId).then(modulesWithDiscussionItems =>
writeSummaryByModuleToCSV(courseId, modulesWithDiscussionItems) // Writes summary of module data to CSV
)
: Promise.resolve() // No additional operation if condition is false

return Promise.all([basePromise, additionalPromise])
})
).catch(error => {
const detailedErrorMessage = error.message || `An unexpected error occurred: ${error}`
console.error('Error processing discussions:', detailedErrorMessage)
});
console.error('Error processing discussions and modules:', error.message || `An unexpected error occurred: ${error}`)
})
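
The "conditional promise" pattern in the final hunk can be shown in isolation. The sketch below is an illustration under assumptions, not code from this commit: `runForCourse` and the `tasks` object are hypothetical stand-ins for the API calls and CSV writers, so that the control flow can be run without Canvas access.

```javascript
// Hypothetical stand-in for the per-course pipeline. `tasks` supplies
// promise-returning functions in place of the real API calls and CSV writers.
const runForCourse = (courseId, includeModuleSummary, tasks) => {
  // Always fetch discussions, then write the detail and summary files concurrently.
  const basePromise = tasks.fetchDiscussions(courseId).then(discussions =>
    Promise.all([
      tasks.writeDetail(courseId, discussions),
      tasks.writeSummary(courseId, discussions)
    ])
  )

  // Only start the module work when requested; otherwise use an
  // already-resolved promise so the caller can await both uniformly.
  const additionalPromise = includeModuleSummary
    ? tasks.fetchModules(courseId).then(modules =>
      tasks.writeModuleSummary(courseId, modules)
    )
    : Promise.resolve()

  return Promise.all([basePromise, additionalPromise])
}
```

Because both promises are created before either is awaited, the module fetch starts without waiting for the discussion fetch to finish, and `Promise.resolve()` keeps the return shape identical whether or not the module summary is enabled.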