Growth Hackers in SNU
Business Data Professionals of SNU
Overview
Through Growth Hackers in SNU, I worked on industry-facing data projects with partner companies and learned how messy business questions must be translated into workable modeling and evaluation problems. This was the stage where I began to see applied ML not as isolated model-building, but as a process of framing the right question, understanding the data, and deciding what kind of result would actually be useful.
I also served as the 7th education team leader, redesigning the internal curriculum so newer members could learn data analysis and modeling in a more practical and structured way.
Selected Projects
RecSys Modeling with Educast
Oct 2020 - Dec 2020
This project explored recommendation for education-oriented content by comparing multiple modeling directions. I implemented deep generative recommendation models in PyTorch and also tested content-based methods built on embeddings and clustering.
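The content-based direction can be sketched roughly as follows: score unseen courses by the cosine similarity between their embedding and a profile built from what the user already watched. This is a minimal illustration, not the project code; the course names and embedding vectors are made up.

```python
import numpy as np

# Toy course embeddings (illustrative values, not real data)
course_emb = {
    "intro_python":    np.array([1.0, 0.1, 0.0]),
    "advanced_python": np.array([0.9, 0.2, 0.1]),
    "watercolor":      np.array([0.0, 0.1, 1.0]),
}

def recommend(watched, k=1):
    """Rank unwatched courses by cosine similarity to the user's profile."""
    profile = np.mean([course_emb[c] for c in watched], axis=0)
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    scores = {c: cos(profile, e) for c, e in course_emb.items() if c not in watched}
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend(["intro_python"]))  # ['advanced_python']
```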
As the project progressed, the central challenge became less about model choice itself and more about evaluation: what metric actually reflects a good recommendation in a given scenario? I led that analysis and argued that evaluation should vary with the recommendation context rather than be treated as one-size-fits-all.
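The point about context-dependent evaluation can be made concrete with two standard offline metrics. In the sketch below (toy lists, hypothetical item IDs), two recommenders tie on precision@k but differ on NDCG@k, because NDCG rewards putting hits near the top; which metric is "right" depends on how users actually consume the list.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that are relevant."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def ndcg_at_k(recommended, relevant, k):
    """Rank-aware gain: relevant items near the top count more."""
    dcg = sum(1 / math.log2(i + 2)
              for i, item in enumerate(recommended[:k]) if item in relevant)
    ideal = sum(1 / math.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / ideal if ideal > 0 else 0.0

relevant = {"c1", "c2"}
system_a = ["c1", "x1", "x2", "c2", "x3"]  # one hit at the very top
system_b = ["x1", "c1", "c2", "x2", "x3"]  # hits in the middle

print(precision_at_k(system_a, relevant, 5), precision_at_k(system_b, relevant, 5))  # tie: 0.4 vs 0.4
print(ndcg_at_k(system_a, relevant, 5) > ndcg_at_k(system_b, relevant, 5))           # True
```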
The main value of the project was in clarifying how modeling and evaluation choices should be judged in an applied recommendation setting.
Automating Delivery Area Assignment and Cargo Demand Prediction with Timfresh
Jul 2020 - Aug 2020
This project involved two distinct technical problems. On the operational side, Timfresh was assigning delivery areas manually, so drivers were paid similarly while carrying very different delivery burdens in practice. As co-PM on the technical side, I redefined delivery areas using actual route distance from a map API, rather than administrative boundaries or straight-line map distance, and then built a clustering-based assignment scheme that also considered additional hierarchical structure.
By comparing the resulting variance against the existing manual assignment, I showed that the current data was already enough to support a more balanced and more automated planning process, even if the system itself was not directly deployed.
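The core mechanism can be sketched as hierarchical clustering over a precomputed route-distance matrix, so that stops close by road (not by administrative boundary) end up in the same delivery area. The distance values below are invented for illustration; the real matrix would come from a map API.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# route_dist[i][j]: road-network distance between stops i and j
# (illustrative values; a real matrix would come from a map API)
route_dist = np.array([
    [0.0, 2.0, 9.0, 8.0],
    [2.0, 0.0, 8.0, 9.0],
    [9.0, 8.0, 0.0, 3.0],
    [8.0, 9.0, 3.0, 0.0],
])

# Average-linkage clustering directly on the precomputed distances
Z = linkage(squareform(route_dist), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # stops 0,1 share a cluster; stops 2,3 share the other
```

Once areas are assigned this way, per-driver delivery counts can be compared against the manual assignment (e.g. via their variance) to quantify how much more balanced the automated scheme is.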
A separate problem was cargo demand forecasting. The company expected that a deep-learning approach might solve it, but after comparing DL models with traditional forecasting methods such as ARIMA and SARIMA, I found that simple seasonal models still performed better.
With very limited feature columns and not enough accumulated data, the right recommendation was not to force a more complex model, but to first improve the data foundation. That project became an important lesson in judging when the real bottleneck is the model and when it is the data.
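A minimal version of the baseline that a complex model must beat is a seasonal-naive forecast: predict each day with the value from one season earlier. The series below is synthetic and perfectly weekly-seasonal, so the baseline is exact; on real demand data it simply sets the bar.

```python
def seasonal_naive(history, horizon, season=7):
    """Forecast each future step with the value one season earlier."""
    return [history[-season + (h % season)] for h in range(horizon)]

def mae(actual, predicted):
    """Mean absolute error between two equal-length series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Synthetic weekly-seasonal demand: 14 days of history, 7 days to predict
series = [100, 80, 90, 120, 150, 200, 170] * 3
history, actual = series[:14], series[14:]

forecast = seasonal_naive(history, horizon=7, season=7)
print(mae(actual, forecast))  # 0.0 on this perfectly seasonal series
```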
RecSys Modeling with Mathpresso
Apr 2020 - Jun 2020
This project began with basic EDA, and the main issue surfaced before modeling even started: once filtering was applied, the remaining usable data was too limited to support a strong recommender.
I still built a predictive recommendation model in TensorFlow and analyzed the learned embeddings, but the resulting representations were heavily biased toward a very narrow set of data features.
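One quick diagnostic for this kind of bias is to check how much of the embedding variance is concentrated in a single direction. The sketch below simulates a collapsed embedding matrix (the data is synthetic, not the project's embeddings) and measures the variance explained by the top singular direction; a value near 1.0 means the representations carry essentially one signal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate embeddings dominated by a single direction plus small noise
# (synthetic stand-in for a collapsed learned embedding matrix)
direction = rng.normal(size=32)
embeddings = np.outer(rng.normal(size=500), direction) \
             + 0.1 * rng.normal(size=(500, 32))

centered = embeddings - embeddings.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / (s**2).sum()
print(explained[0])  # close to 1.0: nearly all variance in one direction
```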
The most important outcome was not a better model, but the conclusion that the data itself needed to be improved before more sophisticated ML would become meaningfully useful. Communicating that insight back to the company was one of the most valuable parts of the project.
What It Taught Me
The most important lesson from this period was that ML still has to begin with the data. Before model choice or sophistication matters, you have to understand what signal the data actually contains, what disappears during filtering, and whether the remaining structure is strong enough to support a meaningful decision. That perspective became a lasting part of how I think about applied ML: better modeling only matters when the data can truly support it.