Using Selenium/Chrome to Analyze Media: Evolution of Our Testing Framework

Our journey to build a robust media analysis pipeline using Selenium and Chrome has been one of continuous improvement. This post details the significant enhancements we’ve made to our testing framework, particularly focusing on handling complex media content in MorphoSource.

Key Improvements in Browser Setup

Our browser configuration has evolved to handle modern web challenges. Here’s our enhanced setup:

def setup_driver():
    chrome_options = Options()
    chrome_options.add_argument('--headless=new')  # Modern headless mode
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-dev-shm-usage')
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--disable-software-rasterizer')
    chrome_options.add_argument('--window-size=1920,1080')
    
    service = Service(ChromeDriverManager().install())
    driver = webdriver.Chrome(service=service, options=chrome_options)

Notable improvements include:

Using the new headless mode (--headless=new)
Adding software rasterizer disable option
Implementing ChromeDriverManager for automatic driver management

Enhanced Media Orientation Control

We’ve implemented sophisticated controls for managing media orientation:

def set_orientation(driver, actions, orientation):
    try:
        orientation_label = driver.find_element(By.XPATH, 
            "//label[text()='Orientation']")
        parent_div = orientation_label.find_element(By.XPATH, 
            "./ancestor::div[contains(@class, 'grid gap-4')]")
        combobox = parent_div.find_element(By.CSS_SELECTOR, 
            'button[role="combobox"]')

Key features:

Robust element location using XPath and CSS selectors
Parent-child relationship navigation
Multiple fallback mechanisms for interaction

Improved Error Handling

We’ve implemented comprehensive error handling throughout the pipeline:

def take_screenshot(url):
    file_id = extract_id_from_url(url)
    try:
        # ... screenshot logic ...
    except Exception as e:
        print(f"Error processing URL: {str(e)}")
        if driver:
            try:
                error_file = f"error_{file_id}.png"
                driver.save_screenshot(error_file)
                print(f"Error screenshot saved as {error_file}")
            except:
                print("Could not save error screenshot")
        return False

Improvements include:

Error screenshots for debugging
Detailed error logging
Graceful failure handling

Batch Processing Capabilities

We’ve added the ability to process multiple URLs from a file:

def process_urls_from_file(input_file):
    urls = re.findall(r'https://www\.morphosource\.org/concern/media/\d+', 
                     content)
    
    for i, url in enumerate(urls, 1):
        print(f"\nProcessing URL {i}/{len(urls)}: {url}")
        if take_screenshot(url):
            successful_screenshots += 1

Features:

Regex-based URL extraction
Progress tracking
Success rate monitoring

Intelligent Wait Times

We’ve implemented strategic waiting mechanisms:

def take_screenshot(url):
    driver.set_page_load_timeout(30)
    driver.implicitly_wait(10)
    
    # Wait for complex media loading
    time.sleep(480)  # Allow time for heavy 3D content
    
    # Wait for UI interactions
    wait = WebDriverWait(driver, 10)
    uv_iframe = wait.until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "iframe#uv-iframe"))
    )

Improvements:

Explicit waiting for heavy media content
Smart timeouts for different operations
Combined implicit and explicit waits

Advanced Interaction Handling

We’ve enhanced our UI interaction capabilities:

def set_orientation(driver, actions, orientation):
    # Attempt standard click
    actions.move_to_element(combobox)
    actions.click()
    actions.perform()
    
    # Fallback mechanism
    if not is_expanded:
        actions.move_to_element(combobox)
        actions.click_and_hold()
        actions.pause(1)
        actions.release()
        actions.perform()

New features:

Multiple interaction methods
Verification of interaction success
Fallback strategies

Output Management

We’ve improved our output handling and naming conventions:

def take_screenshot(url):
    screenshot_file = f"{file_id}_{orientation.replace(' ', '_')
        .replace('(', '')
        .replace(')', '')
        .replace('+', 'plus')
        .replace('°', '')}.png"

Enhancements:

Structured file naming
Special character handling
Orientation-specific naming

Looking Forward

Future improvements we’re considering:

Parallel processing for multiple URLs
Enhanced retry mechanisms
More sophisticated progress reporting
Integration with cloud storage services

Conclusion

Our Selenium/Chrome testing framework has evolved into a robust solution for media analysis. Through careful attention to error handling, timing, and interaction management, we’ve created a reliable system for automated media testing and screenshot capture. These improvements have significantly enhanced our ability to analyze and validate media content across the MorphoSource platform.

Conclusion

The evolution of our Selenium testing framework mirrors the broader journey of microservices development. By applying principles of resilience, observability, and maintainability, we’ve created a robust testing solution that supports our growing platform.

As we continue to enhance MorphoSource, these lessons in microservices architecture and test automation will guide our future development, ensuring we maintain high quality and reliability in our service delivery.