Call for participation

Leveraging tool support for the analysis of computer-mediated activities

The analysis of multimodal computer-mediated human interaction data, such as that often found in CSCL and related fields, is difficult: the diverse nature of this data and its sheer quantity are challenging enough, but a further obstacle is introduced by the complex nature of these interactions (Dyke et al., 2009). A recent series of workshops on productive multivocality in CSCL research has shown the value both of using tools to support the creation and sharing of analytic representations and of using state-of-the-art language technologies to help explore corpora of discussion data.

It has been our experience that researchers who are not also tool authors, or who do not work in close collaboration with tool authors, rarely use the full potential offered by state-of-the-art analysis software. We hypothesize that there are three principal reasons for this situation. First, most tools rarely develop beyond “research prototype” status, which leads to poor usability and documentation and leaves few resources available for maintenance. Second, the cost of setting up before any analysis can be performed (installation, learning to use the software, data import) is often high or hard to evaluate. Last, it is often very hard to assess whether a given tool is suited to the researcher’s need, since each need has a slightly ad-hoc component that makes comparison with other cases difficult. In spite of these problems, analysis tools do have the potential to save time and to allow researchers to focus on the kinds of analytic work which can only be performed by humans, as opposed to the more “menial” tasks which can be delegated to a computer.

In this tutorial, we explore how two such tools (Tatiana [1] and SIDE [2], both of which are freely downloadable) can be of use to researchers in the learning sciences who would like to:

  • Collect and analyse multimodal and multimedia data (e.g. video and log files, in concert)
  • Create interactive visualisations which help explore and examine the data
  • Record annotations, codes and relationships between events
  • Share analyses with other researchers (advisors, collaborators, etc.)
  • Combine multiple analytic perspectives
  • Use language technologies to assist coding and labelling
  • Use language technologies to explore and summarize large quantities of data
  • Find patterns of interest and statistical outliers

This tutorial will focus on three phases:

  1. Review of the state of the art on computer models for analysis (Dyke, 2010), its limitations and related methodological issues
  2. Hands-on overview of SIDE and Tatiana and their functionalities, using analyses from the authors’ previous work
  3. Free tutored time using the tools to perform analyses, based as much as possible on participants’ own research questions on their own data and/or provided sample corpora.

Throughout the tutorial, we will draw on our experience in the multivocality workshops to help participants relate their tool usage to their research practices and methodologies, in order to better understand the limitations of the state of the art and the analysis opportunities which it provides to the CSCL field.

Depending on the wishes expressed by participants, we may also explore how existing tools might be extended to meet new analytic needs, but we do not expect this to be a key focus.

Participants

This tutorial is aimed at researchers from the CSCL community in general, and most specifically at those who are interested in the detailed analysis of online or face-to-face conversations. Prospective participants should submit a statement of interest that briefly (2-5 paragraphs) presents their prior and current work, their typical methodologies and their expectations for the tutorial.

We would like to support researchers who are looking for tools to solve their analytic problems by giving them the opportunity to examine their own data in SIDE and Tatiana. Prospective participants may therefore optionally submit a dataset, which we will help import into Tatiana and SIDE before the tutorial. Such datasets should:

  • Include timestamped transcripts, either of online discussion or video recordings
  • Be in a machine-readable format (Excel, XML, CSV, etc.)
  • Optionally be partially analyzed (codes, annotations, etc.)
  • Include as complete a description as possible of the research and pedagogical context the dataset is taken from

If you wish to submit a dataset, please describe it in your statement of interest.
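
As an illustration of the dataset requirements above, a timestamped transcript in CSV form could be sanity-checked with a sketch like the one below before it is sent. The column names (timestamp, speaker, utterance, code), the timestamp format and the sample rows are all assumptions made for this example; they are not a format required by Tatiana or SIDE.

```python
import csv
import io
from datetime import datetime

# Hypothetical column names chosen for illustration only;
# they are not prescribed by Tatiana or SIDE.
REQUIRED_COLUMNS = ("timestamp", "speaker", "utterance")
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"  # assumed timestamp layout

# Invented sample transcript; the optional "code" column holds partial analyses.
SAMPLE_CSV = """timestamp,speaker,utterance,code
2011-01-31T10:00:05,A,Shall we start with question 2?,proposal
2011-01-31T10:00:12,B,"Yes, I think the answer is 4.",claim
2011-01-31T10:00:20,A,Why 4?,
"""


def check_transcript(text):
    """Return a list of problems found in a CSV transcript (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    previous = None
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        try:
            stamp = datetime.strptime(row["timestamp"] or "", TIMESTAMP_FORMAT)
        except ValueError:
            problems.append(f"row {row_number}: unreadable timestamp")
            continue
        if previous is not None and stamp < previous:
            problems.append(f"row {row_number}: timestamp goes backwards")
        previous = stamp
        if not (row["utterance"] or "").strip():
            problems.append(f"row {row_number}: empty utterance")
    return problems


print(check_transcript(SAMPLE_CSV))
```

A transcript that passes such a check has a parseable timestamp and a non-empty utterance on every row; whether it can be imported as-is still depends on the tools themselves.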

Accepted participants will receive a set of dataset descriptions (the accepted datasets plus a few sample datasets provided by the organisers). Prior to the event, they will be expected to choose one of these datasets and a research question related to it, which they will plan to explore during the third part of the tutorial.

Participation is limited to 15 participants. We will accept up to 3 datasets. Priority will be given to young researchers and researchers whose typical methodologies are deemed best suited to computer-assisted analysis.

Dates

These dates are not hard deadlines. We suggest, however, that you at least send a short one-line email by the requested dates to notify us of your interest. While latecomers may be accepted if there is sufficient room, we may not be able to provide the same level of support, or adapt the tutorial as closely to individual researchers' needs, if we do not have enough preparation time.

February 21, 2011: Statement of interest (2-5 paragraphs) due (priority consideration)
March 15, 2011: Notification of participation in the tutorial
April 15, 2011: Data sent to tutorial organizers
June 1, 2011: Dataset descriptions sent out to participants

All correspondence should be sent to Gregory Dyke.

Organisers

Gregory Dyke is a post-doctoral researcher at the Language Technologies Institute at Carnegie Mellon University. He completed his PhD at the Ecole des Mines of Saint Etienne under the supervision of Jean-Jacques Girardot and Kristine Lund, with a focus on the means to provide computer support for the analysis of CSCL data. This work resulted in the creation of Tatiana (Trace Analysis Tool for Interaction Analysts; Dyke, Lund, & Girardot, 2009), which provides the means to create and re-use a variety of analytic representations. Tatiana has been used by over 15 research projects in 10 countries, and he was joint recipient of the Student Best Paper award at CSCL 2009 for it. He continues to work on computational methods for the analysis of human interactions, with a focus on how time can best be taken into account in learning activities.

Carolyn Rosé is an Assistant Professor in the School of Computer Science at Carnegie Mellon University, with a joint appointment between the Language Technologies Institute and the Human-Computer Interaction Institute. She serves as an Executive Committee member of the Pittsburgh Science of Learning Center and co-thrust leader of its Social and Communicative Factors in Learning thrust. Her research team has worked on analyses of collaborative learning interactions in both text and speech, in pairs and small groups, with learners from middle school through college age, in the US and abroad, and in a variety of subjects such as math, science, engineering and psychology. Her team has produced tools for supporting automatic collaborative learning process analyses, including Taghelper tools, which have over 1,200 users in 69 countries. Her teaching activities include courses on machine learning in practice, CSCL, and research design and writing.

References

  • [1] http://code.google.com/p/tatiana (Dyke et al., 2009)
  • [2] http://www.cs.cmu.edu/~cprose/SIDE.html (Kane, Chaudhuri, Joshi, & Rosé, 2008)
  • Dyke, G., Lund, K., Jeong, H., Medina, R., Suthers, D. D., van Aalst, J., Chen, W., & Looi, C.-K. (submitted). Technological affordances for productive multivocality in analysis. In H. Spada, G. Stahl, N. Miyake, N. Law & K. M. Cheng (Eds.), Connecting Computer-Supported Collaborative Learning to Policy and Practice: Proceedings of the 9th International Conference on Computer-Supported Collaborative Learning (CSCL 2011). Hong Kong: University of Hong Kong.
  • Kane, M., Chaudhuri, S., Joshi, M., & Rosé, C. (2008). SIDE: The Summarization Integrated Development Environment. Proceedings of the Association for Computational Linguistics, Demo Abstracts.
  • Rosé, C., Wang, Y.-C., Cui, Y., Arguello, J., Stegmann, K., Weinberger, A., & Fischer, F. (2008). Analyzing collaborative learning processes automatically: Exploiting the advances of computational linguistics in CSCL. International Journal of Computer-Supported Collaborative Learning, 3(3).
  • Suthers, D. D., Lund, K., Rosé, C., Dyke, G., Law, N., Teplovs, C., et al. (submitted). Towards productive multivocality in the analysis of collaborative learning. In H. Spada, G. Stahl, N. Miyake, N. Law & K. M. Cheng (Eds.), Connecting Computer-Supported Collaborative Learning to Policy and Practice: Proceedings of the 9th International Conference on Computer-Supported Collaborative Learning (CSCL 2011). Hong Kong: University of Hong Kong.
Last modified on January 31, 2011