Background/Question/Methods

With over 21 million records of plants from around the globe, citizen science platforms such as iNaturalist have immense potential to serve as a data resource for phenological research. Digital images collected in the field preserve biological information such as color, habitat, and delicate structures better than the pressed herbarium specimens commonly used in phenological studies, and geotagged images can be related directly to spatially explicit environmental data. However, these large datasets are time-consuming to annotate manually. Computer vision techniques have been used successfully to automate the annotation of herbarium specimens, but they have not been applied to images collected by citizen scientists. Such images pose additional classification challenges, including cluttered backgrounds, inconsistent viewing angles, and varying levels of zoom. We adapted techniques from studies applying machine learning to herbarium specimens for use with iNaturalist images. Using images from iNaturalist, we trained a neural network (ResNet-18) in PyTorch to annotate images of Alliaria petiolata (garlic mustard). We first trained a model to distinguish flowering from non-flowering plants, and subsequently began training a model to identify four phenological stages: vegetative, budding, flowering, and fruiting. Both models were trained on 12,010 hand-annotated images of garlic mustard.

Results/Conclusions

The two-stage model trained to distinguish flowering from non-flowering individuals correctly categorized 92.6% of images in a 906-image test dataset, a level of accuracy comparable to manual annotation. The four-stage model correctly classified 87.5% of images in a 2,443-image test dataset. The reduced accuracy relative to the two-stage classifier is largely due to confusion between flowering and fruiting images, especially images showing both flowers and small fruits.
Confusion of a flowering plant for a fruiting plant, or vice versa, accounted for 64% of incorrectly classified images; this level of accuracy is comparable to that of relatively inexperienced human scorers. These results show that machine learning techniques can be employed successfully by researchers to exploit large citizen science datasets. The same approach can be used to train models for phenological annotation of other plant species, even by researchers with little to no prior experience in machine learning. The ability to rapidly annotate phenology in citizen science images allows researchers to tap the full potential of citizen science data and to conduct studies with large sample sizes and observations spanning a broad geographic range.