Bukalapak is looking for researchers to work on various machine learning projects using their transaction and user behavior data. They provide an overview of 12 potential research topics, including developing datasets for product images and descriptions, building an annotation platform, expanding training data, automatic tagging of blog posts, identifying dropshippers in transaction graphs, product categorization with CNNs, developing a product ontology, collaborative filtering, learning to rank recommendations, and predicting next products viewed. They are open to additional topics and provide contacts for further discussion.
One of the major obstacles to advancing research in many fields is the limited availability of standard datasets. With millions of visits daily and petabytes of data, Bukalapak can serve as a data provider for many research problems. However, standard datasets must be of high quality and carefully collected and organized. Some of them may also need to be annotated or labeled before others can use them.
Dataset: search keywords, product summaries and descriptions, product images, complaints and resolutions, product reviews, chats, transactions, etc.
Developing Bukalapak Annotation Platform
Today, the majority of successful machine learning solutions are based on supervised learning. However, their reliance on labeled data has kept many from making significant progress. We aim to build a platform for high-quality data annotation (labeling) to help researchers and practitioners collect training data for supervised learning.
Training Set Expansion
Some problems have only a small amount of training data. Can we still train a model in such a situation? One approach is to expand the training set automatically from a few highly confident seeds. We are looking into applying this method to machine learning problems at Bukalapak.
Dataset: product description, images, reviews
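The seed-expansion idea can be illustrated with a small self-training loop. This is only a sketch under stated assumptions, not Bukalapak's method: the word-overlap scorer, the function names, the 0.8 confidence threshold, and the sample product titles are all hypothetical and chosen for illustration.

```python
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {}
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(tokenize(text))
    return counts


def predict(counts, text):
    """Return (label, confidence); confidence is the best label's share of
    the total word-overlap score, or 0.0 when no known word appears."""
    tokens = tokenize(text)
    scores = {label: sum(c[t] for t in tokens) for label, c in counts.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def expand_training_set(seeds, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly retrain and move confidently predicted items into the
    labeled set; stop when a round adds nothing."""
    labeled = list(seeds)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        counts = train(labeled)
        confident, remaining = [], []
        for text in pool:
            label, confidence = predict(counts, text)
            if confidence >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break
        labeled.extend(confident)
        pool = remaining
    return labeled, pool


# Hypothetical seeds and unlabeled product titles.
seeds = [("wireless phone charger", "electronics"),
         ("cotton batik shirt", "fashion")]
unlabeled = ["fast phone charger cable", "long sleeve batik shirt",
             "cable organizer", "garden hose"]
labeled, pool = expand_training_set(seeds, unlabeled)
```

In this toy run, "cable organizer" shares no word with the seeds and is only labeled in the second round, after "fast phone charger cable" has joined the training set. "garden hose" never reaches the threshold and stays unlabeled.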
Automatic Tagging for Bukalapak Blogs
Bukalapak has published a large amount of content on its blog. Typically, each post has several tags that associate it with related products or entities. In this research, we want to infer relevant tags automatically from the information in the text.
Dataset: blog entries, product tags
Identifying Middlemen Using Transaction Graph
Dropshipping is a common practice on many e-commerce sites. Although it is not prohibited, dropshippers do affect the buyer's experience. If we can build a graph that models transactions in Bukalapak, identifying dropshippers would be trivial. Such a graph would also be useful for many other purposes, such as fraud prevention and consumer targeting.
Dataset: transactions, users
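One way to picture the transaction graph is as directed buyer-to-seller edges. The sketch below flags a user as a dropshipper candidate when they place an order that is delivered to one of their own customers. The heuristic, the record fields (`buyer`, `seller`, `recipient`), and the sample data are assumptions made for illustration; the actual Bukalapak schema and detection criteria are not described in this document.

```python
from collections import defaultdict


def find_dropshipper_candidates(transactions):
    """Return {user: {(supplier, end_customer), ...}} for users who ordered
    goods delivered to someone who also bought from them."""
    # customers[s] holds every user who has bought from seller s.
    customers = defaultdict(set)
    for t in transactions:
        customers[t["seller"]].add(t["buyer"])

    relays = defaultdict(set)
    for t in transactions:
        buyer, recipient = t["buyer"], t["recipient"]
        # The buyer had the order shipped to one of their own customers.
        if recipient != buyer and recipient in customers.get(buyer, ()):
            relays[buyer].add((t["seller"], recipient))
    return dict(relays)


# Hypothetical transactions: budi sells to citra, then orders from andi
# with citra as the delivery recipient.
transactions = [
    {"buyer": "citra", "seller": "budi", "recipient": "citra"},
    {"buyer": "budi", "seller": "andi", "recipient": "citra"},
    {"buyer": "dewi", "seller": "andi", "recipient": "dewi"},
    {"buyer": "eko", "seller": "andi", "recipient": "fajar"},
]
candidates = find_dropshipper_candidates(transactions)
```

Here only budi is flagged. The order from eko to fajar is treated as a gift because fajar never bought anything from eko. A real system would likely also need product and time matching to reduce false positives.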
Product Categorization using Deep CNN Models
Many products in Bukalapak are listed in incorrect categories. We want to reduce the number of wrong category selections by suggesting the correct category when a seller creates a product page. The product image will be used as the input signal for a detection or classification model.
Dataset: product images, categories, Google Open Images dataset
Developing Bukalapak Products Ontology
We believe that the Semantic Web can provide a framework to enhance the discoverability of our products and services. It may also increase the reusability of product catalogs across the Web. This would benefit not only us as a company but also the research community and the e-commerce industry at large.
Dataset: products, GoodRelations ontology
Collaborative Filtering
We have evidence that recommendations can be made more accurate through personalization. For many years, collaborative filtering has been used to build personalized recommendations. With millions of users and billions of recorded behavioral data points, the challenge is how to make collaborative filtering methods scalable.
Dataset: users by transactions, search patterns.
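For readers unfamiliar with the technique, a minimal item-based collaborative filtering sketch is shown here. It assumes implicit feedback (a user either bought an item or did not) and cosine similarity between items; the function names and purchase data are hypothetical and do not describe Bukalapak's production recommender.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(purchases):
    """purchases maps user -> set of items. Returns cosine similarity for
    every pair of items that share at least one buyer."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)

    sims = defaultdict(dict)
    items = list(buyers)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            common = len(buyers[a] & buyers[b])
            if common:
                s = common / sqrt(len(buyers[a]) * len(buyers[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims


def recommend(purchases, user, top_n=3):
    """Score unseen items by summed similarity to the user's items."""
    sims = item_similarities(purchases)
    owned = purchases[user]
    scores = defaultdict(float)
    for item in owned:
        for other, s in sims.get(item, {}).items():
            if other not in owned:
                scores[other] += s
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda x: (-scores[x], x))[:top_n]


# Hypothetical purchase history.
purchases = {
    "u1": {"phone", "case", "charger"},
    "u2": {"phone", "case"},
    "u3": {"phone", "charger"},
    "u4": {"case"},
    "u5": {"phone"},
}
suggestions = recommend(purchases, "u5")
```

The all-pairs loop in `item_similarities` grows quadratically with the number of items, which is exactly the kind of cost that makes scalability the central challenge at the data volumes mentioned above.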
Learning-to-Rank for Recommender System
Although our current recommender system has already generated a significant GMV increase, we are continuously working to improve it. Ranking is one of the most crucial and challenging problems in making better recommendations. Given the massive amount of data, learning-to-rank is a promising area that we want to explore.
Dataset: product detail views, user clicks
Product Discovery Sequence Prediction
The user experience when searching for products can be enhanced if we are able to predict which products are highly likely to be viewed next. These next n products are determined based on the sequence of previous product views (similar to time series prediction). We can expand this capability to develop an infinite product discovery queue.
Dataset: product detail views
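As a simple baseline for next-view prediction, one could count how often each product is viewed immediately after another and rank candidates by that count. The sketch below is a first-order Markov model, which is a deliberate simplification: it uses only the most recent view, not the full sequence described above. The function names and session data are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count product -> next-product transitions over view sequences."""
    transitions = defaultdict(Counter)
    for views in sessions:
        for current, nxt in zip(views, views[1:]):
            transitions[current][nxt] += 1
    return transitions


def predict_next(transitions, last_viewed, n=3):
    """Return up to n products most often viewed right after last_viewed."""
    counts = transitions.get(last_viewed)
    if not counts:
        return []
    # Most frequent first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [product for product, _ in ranked[:n]]


# Hypothetical product-detail-view sessions.
sessions = [
    ["phone", "case", "charger"],
    ["phone", "case", "screen protector"],
    ["phone", "charger"],
    ["case", "screen protector"],
]
model = fit_transitions(sessions)
next_after_phone = predict_next(model, "phone", n=2)
```

Calling `predict_next` repeatedly on its own top result gives a crude version of a continuous discovery queue, although a sequence model that conditions on longer histories would likely be needed in practice.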
Open Topics
The list of topics mentioned here is not exhaustive. We are open to proposals for new topics or areas that we have not considered. Both basic and applied research are welcome.
Contacts (Telegram, Email)
Ibam: @iarief, ibam@bukalapak.com
Pray: @prynglh, prayana.galih@bukalapak.com
Diko: @mardiko, rahmatri.mardiko@bukalapak.com
Zulva: @zulvacupa, zulva.fachrina@bukalapak.com
Thank You