We are seeking a skilled and detail-oriented engineer to join our dynamic team. In this role, you will be responsible for the development, maintenance, and optimization of real-time systems that enable data collection, processing, analysis, visualization, and alerting. You will work closely with data engineers, software developers, and data analysts to ensure the smooth operation of our data acquisition infrastructure. The ideal candidate has a strong background in web scraping, data extraction, and system monitoring, along with a passion for solving complex technical challenges.
Responsibilities
Design, develop, and maintain web crawlers and data extraction pipelines to collect data from various online sources.
Design and implement infrastructure components for high availability, scalability, and efficiency.
Troubleshoot and resolve issues related to crawler operations and data quality problems.
Collaborate with teams to understand data requirements and implement solutions to meet business needs.
Develop and maintain documentation for crawler systems, including technical specifications, operational procedures, and troubleshooting guides.
Stay up to date with the latest trends and technologies in web scraping, data extraction, and distributed systems.
Automate repetitive tasks and improve system reliability through scripting and tool development.
Qualifications
Bachelor’s degree in Computer Science, Information Technology, or a related field.
Proven experience in web scraping, data extraction, and crawler development using tools such as Scrapy, BeautifulSoup, Selenium, or similar frameworks.
Strong programming skills in Python.
Familiarity with web technologies (HTML, CSS, JavaScript, AJAX) and RESTful APIs.
Understanding of network protocols, IP management, and proxy services.
Strong problem-solving skills and the ability to work independently or as part of a team.
Excellent communication skills and the ability to document technical processes clearly.
Preferred Qualifications
Experience with machine learning or natural language processing (NLP) techniques for data extraction.
Experience with distributed systems, cloud platforms (e.g., AWS, Azure, GCP), and containerization technologies (e.g., Docker, Kubernetes).
Knowledge of database systems (SQL, NoSQL) and data storage solutions.
Knowledge of data privacy regulations and ethical considerations in web scraping.
Familiarity with version control systems (e.g., Git) and CI/CD pipelines.
Experience with monitoring tools (e.g., Prometheus, Grafana) and log management systems (e.g., ELK Stack).
Location of work: Hong Kong Science Park
Benefits
Competitive salary
Annual leave and group medical insurance
Good team culture
Assistance with applying for an IANG visa