Project MOSAIC: tools for teaching mathematical modeling and statistics

Via a commenter over at Small Pond Science, just discovered Project MOSAIC, an NSF-funded project to help teach introductory undergraduate mathematical modelling, statistics, and computation. Among other things, they’ve developed a very handy-looking R package called mosaic, which simplifies the use of R for basic statistical and modeling tasks, and alters the output in a way designed to be friendly to people new to both statistics and R.

I’m very interested in this as I’m currently revising our intro biostats course. Have you used the mosaic package or other Project MOSAIC resources in your own teaching? If so, please chime in with advice in the comments, or drop me a line.

8 thoughts on “Project MOSAIC: tools for teaching mathematical modeling and statistics”

  1. Hi Jeremy, there are a number of menu-based wrappers to R including e.g. R commander and R deducer (just google “R gui”). I would definitely look at those as well. Although, personally if it were me teaching an undergraduate class I would still stick with something like JMP due to the great ease of use (and also JMP actually has more analyses available via point and click than interfaces on R do – you still have to drop into the command line to use things much more advanced than t-test/ANOVA/regression in the R GUIs last I looked – although obviously this varies by interface).

    I do teach a graduate stats course using R although it requires a separate 2 hour lab/week to cover the R-specific features (and it nearly drives some students to tears). But most students are grateful in the end for having been trained in what is clearly (and despite my wishes) the tool of choice in ecology that they know they’ll have to use (see Terry’s recent post). But for an undergrad class where I’m still more focused on the concepts I personally would stick with a real from-the-ground-up GUI tool.

    • Yes, I actually just plugged R Commander myself to Terry McGlynn (he wants to learn R for his own research). There’s R Studio too.

      The course I’m teaching already has a number of labs built around command line R, so we’ll probably stick with that, at least in the short term. I’ve already got my hands full making various other changes to the course, without rewriting the numerous existing labs I’m planning to keep. Indeed, we’ll probably be increasing the use of command line R (right now early labs in the course on descriptive stats and data exploration are in Excel; I want to switch them to R). And I think it’s good for the students to have to think about what they’re asking the software to do, which you have to do if you’re working in the command line. But I certainly am aware of the steep learning curve and intimidation factor for command line R. That’s part of why I’m interested in Mosaic–it’s still command line R (as far as I can tell; I’ve only barely glanced at it), but a more student friendly command line R. But at some point down the road it’s possible I or someone else might switch over to teaching the course using R with a gui.

      The course is committed to using R because it’s free, it’s increasingly the tool of choice in many fields (so students who go on to grad school will have an advantage in already having used it), and because students can download it and run it on their own computers too if they wish. No prospect of us switching to JMP or anything else.

      • That’s a big drawback about JMP in my course. Students pay thirty bucks to have it for six months during the course. Then when they need to apply what they learned, they don’t have a working copy.

        The only reason I wouldn’t use any command-line program when teaching is that, frankly, I don’t have the time to deal with so many student problems on an individualized basis. (If I had a TA, or a non-astronomical teaching load, then I’d take it on). I think grad students of all flavors need to emerge having learned R, somewhere, because it’s a marketable skill.

      • OK – misunderstood your context. Makes sense. I understand it’s not the way you’re heading, but for other readers, I’ll throw in my two cents that I think Excel with the data analysis package add-in (included with Excel but have to check a box to turn it on) is an underappreciated tool for undergrads. It has everything up through regression and it’s even possible (I’ve done it) to teach students how to do things like logistic regression in Excel. And most students will have access to Excel in their future. On the downside Excel certainly causes its own confusion that has to be taught through and it is not an appropriate tool for research statistics.

      • I agree. I actually do have some labs for other courses that I do in Excel, including one that makes use of Excel’s data analysis and optimization add-ons. It works fine, students in those courses are comfortable with Excel, and many of them have Excel (or can use one of the many computers on campus with Excel).

        In this intro biostats course, I want to shift the early labs out of Excel and into R in large part to get the students more practice with R. I’m told by the current instructors that the students don’t feel like they get enough practice with R before the lab final.

  2. I teach into a first-year subject that all science majors take called “Introduction to Modelling Natural Systems” which uses Excel, though we agonized about whether to use that or something like Splus. This subject introduces students to theoretical and statistical modelling as different kinds of uses of models (essentially, using models as a tool for deriving the consequences of a set of assumptions about how nature works, versus using models as a tool for characterizing patterns in empirical data). We went with Excel because the math and stats that we do is quite basic (e.g., iterating the discrete logistic with harvesting, fitting linear models using least-squares), and because students will invariably make extensive use of spreadsheets during their undergraduate careers, so having some familiarity with it is time well spent, even though I would be horrified if any of my students used it to analyse data as Honours or graduate students.

    The folks who teach the equivalent of intro biometrics at second year use SPlus. The advantage of this is that it is less scary for many students than R, but it should facilitate a transition to R down the track. This is especially true if you introduce the idea of scripting in small bits during the semester. I teach an “Ecological Detective”-type graduate course, which uses R, and there are now a lot of graduate workshops in “methods for X” offered periodically around campus that generally use R. I haven’t taught my graduate course since the transition to SPlus was made at undergraduate level, so I’ll be interested to see if my hypothesis about R being easier to pick up is borne out. Ask me next February (too late for your subject re-design, I’m afraid).

    You might look at a text by Dan Kaplan, which is connected to Project Mosaic somehow, if you are thinking about R or S-plus. It’s called “Statistical Modeling: A Fresh Approach.” Some things I like about this (from an admittedly cursory read-through) are conceptualizing stats as a modelling enterprise rather than algorithm-following, lots of emphasis on visualization of data, and the fact that the p-value isn’t introduced until chapter 15, about 2/3 of the way through the book. It, and null hypothesis testing, are introduced as useful tools in the statistical toolkit, and not the principal objective of all data analysis. Kaplan relies a lot on a sort of geometric approach to understanding models (e.g., relationships between observed/predicted/residuals are introduced using vectors). I don’t find this very intuitive personally, but others I’ve talked to think that many students would find this geometric approach a more natural one. That said, if your colleagues are demanding that your students finish your class with the biggest Rolodex of statistical tests possible (so that they can do complex multi-factorial, hierarchical designs that they don’t understand when they take their upper level ecology subjects), then this text probably won’t work for you.

    Good luck.


    • Thanks Sean, always interesting to hear how different folks deal with similar situations.

      I’ve actually settled on a text for the course–Whitlock & Schluter. Written at the right level, strong on concepts, accessible–at least I hope so!
