Limit, Filter and Remove Duplicates in n8n

Learn how to use Limit, Filter, and Remove Duplicates nodes in n8n to clean, filter, and manage your workflow data efficiently.

In this tutorial, you'll learn how to use three essential data manipulation nodes in n8n: Limit, Filter, and Remove Duplicates. These nodes help you clean, organize, and control the data flowing through your workflows, ensuring you only process the data you need.

Overview

When working with large datasets or API responses, you often need to:

  • Limit the number of items processed
  • Filter items based on specific conditions
  • Remove Duplicates to avoid processing the same data multiple times

These three nodes are fundamental building blocks for efficient workflow design and data management.


Limit Node

The Limit node restricts the number of items that pass through to the next node in your workflow.

When to Use Limit Node

  • Process only the first N items from a large dataset
  • Implement pagination manually
  • Limit API calls to stay within rate limits
  • Test workflows with a subset of data
  • Control processing costs

Configuration Options

Max Items: The maximum number of items to pass through

Keep: Choose which items to keep

  • First Items: Keep the first N items (default)
  • Last Items: Keep the last N items
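
Both options are set in the node's UI. For reference, a minimal Code-node sketch of the same behavior (in "Run Once for All Items" mode) might look like this; maxItems plays the role of Max Items:

  // Keep the first N items; use all.slice(-maxItems) for "Last Items"
  const maxItems = 10;
  const all = $input.all();
  return all.slice(0, maxItems);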

Example Use Cases

Use Case 1: Process First 10 Records

Input: 100 customer records
Limit Node Configuration:
  - Max Items: 10
  - Keep: First Items
Output: First 10 customer records

Use Case 2: Get Latest 5 Orders

Input: 50 orders (sorted by date descending)
Limit Node Configuration:
  - Max Items: 5
  - Keep: First Items
Output: 5 most recent orders

Use Case 3: Testing with Sample Data

Input: 1000 items from database
Limit Node Configuration:
  - Max Items: 20
  - Keep: First Items
Output: 20 items for testing

Best Practices

  • Use before expensive operations (API calls, database writes)
  • Combine with Sort node to get top/bottom N items (see the sketch after this list)
  • Helpful for workflow development and testing
  • Consider using with Loop nodes for batch processing
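
As an illustration of the Sort + Limit combination mentioned above, here is a rough Code-node equivalent that picks the five highest-value orders; order_total is an assumed field name:

  // Sort descending by order_total, then keep the first five
  return $input.all()
    .sort((a, b) => b.json.order_total - a.json.order_total)
    .slice(0, 5);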

Filter Node

The Filter node keeps or removes items based on conditions you define. It's like a bouncer that only lets certain data through.

When to Use Filter Node

  • Keep only items that meet specific criteria
  • Remove items with missing or invalid data
  • Filter by date ranges, status, or any field value
  • Implement business logic rules
  • Clean data before processing

Configuration Options

Conditions: Define rules to filter items

Combine Conditions:

  • AND: All conditions must be true
  • OR: At least one condition must be true

Keep Items Where: items pass through only when the conditions evaluate to true; all other items are dropped

Filter Operators

  • Equals: Field equals value
  • Not Equals: Field does not equal value
  • Contains: Field contains text
  • Does Not Contain: Field does not contain text
  • Starts With: Field starts with text
  • Ends With: Field ends with text
  • Exists: Field exists
  • Does Not Exist: Field does not exist
  • Greater Than: Numeric comparison
  • Less Than: Numeric comparison
  • Is Empty: Field is empty or null
  • Is Not Empty: Field has a value
  • Regex: Match using regular expressions

Example Use Cases

Use Case 1: Filter Active Users

Condition:
  Field: status
  Operation: Equals
  Value: active

Input: 
  [
    { "name": "John", "status": "active" },
    { "name": "Jane", "status": "inactive" },
    { "name": "Bob", "status": "active" }
  ]

Output:
  [
    { "name": "John", "status": "active" },
    { "name": "Bob", "status": "active" }
  ]
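
For comparison, the same rule written in a Code node is a one-liner; in practice the Filter node's UI is usually the clearer choice:

  // Keep only items whose status field equals "active"
  return $input.all().filter(item => item.json.status === 'active');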

Use Case 2: Filter High-Value Orders

Condition:
  Field: order_total
  Operation: Greater Than
  Value: 500

Input: 
  [
    { "order_id": "001", "order_total": 750 },
    { "order_id": "002", "order_total": 200 },
    { "order_id": "003", "order_total": 1200 }
  ]

Output:
  [
    { "order_id": "001", "order_total": 750 },
    { "order_id": "003", "order_total": 1200 }
  ]

Use Case 3: Multiple Conditions (AND)

Combine: AND
Condition 1:
  Field: country
  Operation: Equals
  Value: USA

Condition 2:
  Field: age
  Operation: Greater Than
  Value: 18

Result: Only users from USA who are over 18
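
If you prefer a single condition, the same logic can also be written as one expression (the same syntax appears again under Performance Tips below):

  {{ $json.country === "USA" && $json.age > 18 }}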

Use Case 4: Multiple Conditions (OR)

Combine: OR
Condition 1:
  Field: priority
  Operation: Equals
  Value: urgent

Condition 2:
  Field: status
  Operation: Equals
  Value: critical

Result: Items that are either urgent OR critical

Use Case 5: Filter Valid Emails

Condition:
  Field: email
  Operation: Regex
  Value: ^[^\s@]+@[^\s@]+\.[^\s@]+$

Result: Only items with valid email format
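
It can help to sanity-check the pattern in plain JavaScript before pasting it into the node:

  // The same pattern as a JavaScript regex literal
  const emailRe = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  console.log(emailRe.test('john@example.com')); // true
  console.log(emailRe.test('not an email'));     // false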

Filter Best Practices

  • Use specific conditions to reduce false positives
  • Test with sample data before production
  • Consider using expressions for complex logic
  • Chain multiple Filter nodes for readability
  • Use "Is Not Empty" to ensure required fields exist

Remove Duplicates Node

The Remove Duplicates node eliminates duplicate items from your data based on specified fields.

When to Use Remove Duplicates Node

  • Clean datasets with duplicate entries
  • Prevent duplicate API calls or database writes
  • Consolidate data from multiple sources
  • Ensure unique records in reports
  • Optimize workflow performance

Configuration Options

Compare: Choose which fields to compare for duplicates

  • All Fields: Items must match on all fields
  • Selected Fields: Specify which fields to compare

Fields to Compare: Select specific fields to determine duplicates

Example Use Cases

Use Case 1: Remove Duplicate Emails

Configuration:
  Compare: Selected Fields
  Fields: email

Input:
  [
    { "name": "John", "email": "john@example.com" },
    { "name": "Jane", "email": "jane@example.com" },
    { "name": "John Doe", "email": "john@example.com" }
  ]

Output:
  [
    { "name": "John", "email": "john@example.com" },
    { "name": "Jane", "email": "jane@example.com" }
  ]
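
Note that the first occurrence wins, as the output above shows. A Code-node sketch of the same "dedupe by one field" logic makes that explicit:

  // Keep only the first item seen for each email value
  const seen = new Set();
  return $input.all().filter(item => {
    if (seen.has(item.json.email)) return false;
    seen.add(item.json.email);
    return true;
  });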

Use Case 2: Remove Duplicate Orders by ID

Configuration:
  Compare: Selected Fields
  Fields: order_id

Input:
  [
    { "order_id": "001", "total": 100 },
    { "order_id": "002", "total": 200 },
    { "order_id": "001", "total": 100 }
  ]

Output:
  [
    { "order_id": "001", "total": 100 },
    { "order_id": "002", "total": 200 }
  ]

Use Case 3: Remove Exact Duplicates

Configuration:
  Compare: All Fields

Input:
  [
    { "name": "John", "age": 30 },
    { "name": "Jane", "age": 25 },
    { "name": "John", "age": 30 }
  ]

Output:
  [
    { "name": "John", "age": 30 },
    { "name": "Jane", "age": 25 }
  ]

Use Case 4: Multiple Field Comparison

Configuration:
  Compare: Selected Fields
  Fields: first_name, last_name

Result: Removes duplicates where both first and last name match
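
In code terms, comparing on several fields means building one composite key; a short sketch, assuming first_name and last_name fields:

  // Two items are duplicates only if both name fields match
  const seen = new Set();
  return $input.all().filter(item => {
    const key = `${item.json.first_name}|${item.json.last_name}`;
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });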

Remove Duplicates Best Practices

  • Place early in workflow to reduce processing
  • Use specific fields for better control
  • Consider case sensitivity in comparisons
  • Test with your actual data structure
  • Document which fields define uniqueness

Combining All Three Nodes

The real power comes from using these nodes together in your workflows.

Pattern 1: Filter → Remove Duplicates → Limit

1. Filter Node: Keep only active customers
2. Remove Duplicates: Remove duplicate emails
3. Limit Node: Take first 100 for processing

Result: Up to 100 unique, active customers

Pattern 2: Remove Duplicates → Filter → Limit

1. Remove Duplicates: Remove duplicate orders
2. Filter Node: Keep orders > $500
3. Limit Node: Take the first 50 orders (add a Sort node before this step if "top" should mean highest value)

Result: Up to 50 unique, high-value orders

Pattern 3: Limit → Filter → Remove Duplicates

1. Limit Node: Take first 1000 items (performance)
2. Filter Node: Keep only valid records
3. Remove Duplicates: Remove any duplicates

Result: Unique valid records from first 1000 items
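
To make the ordering concrete, here is Pattern 3 condensed into a single Code-node sketch; the three dedicated nodes are usually easier to read and debug, and using email as the validity and uniqueness field is an assumption for the example:

  // Limit -> Filter -> Remove Duplicates, in that order
  const seen = new Set();
  return $input.all()
    .slice(0, 1000)                      // Limit: first 1000 items
    .filter(item => item.json.email)     // Filter: keep records that have an email
    .filter(item => {                    // Remove Duplicates: by email
      if (seen.has(item.json.email)) return false;
      seen.add(item.json.email);
      return true;
    });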

Practical Workflow Examples

Example 1: Clean Customer Contact List

Workflow:
1. Get contacts from database
2. Filter: Remove contacts without email
3. Filter: Keep only verified contacts
4. Remove Duplicates: By email field
5. Limit: Process 500 per batch
6. Send to email marketing platform

Example 2: Process Recent Orders

Workflow:
1. Fetch orders from API
2. Filter: Orders from last 30 days
3. Filter: Status = "completed"
4. Remove Duplicates: By order_id
5. Limit: First 100 orders
6. Generate report

Example 3: Lead Qualification

Workflow:
1. Get leads from multiple sources
2. Remove Duplicates: By email and phone
3. Filter: Score > 70
4. Filter: Country in target list
5. Limit: 50 leads per day
6. Assign to sales team

Performance Tips

  1. Order Matters: Place nodes strategically

    • Remove duplicates early to reduce data volume
    • Filter before expensive operations
    • Limit early when testing
  2. Use Expressions: For complex filtering logic

    {{ $json.value > 100 && $json.status === "active" }}
    
  3. Batch Processing: Combine with Loop nodes

    • Process data in chunks
    • Prevent timeout errors
    • Better resource management
  4. Test Incrementally:

    • Start with small Limit values
    • Test filters with sample data
    • Verify duplicate removal logic

Troubleshooting

Filter Not Working as Expected?

  • Check field names match exactly (case-sensitive)
  • Verify data types (string vs number)
  • Test conditions with Execute Node
  • Use expressions for complex logic

Remove Duplicates Not Removing Items?

  • Verify field names are correct
  • Check for trailing spaces or case differences
  • Consider normalizing data before comparison (see the sketch after this list)
  • Test with "All Fields" to see behavior
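
Case and whitespace differences are the most common culprit. A small Code node placed just before Remove Duplicates can normalize the comparison field; email here is only an example field name:

  // Trim and lowercase the field used for duplicate comparison
  return $input.all().map(item => {
    item.json.email = (item.json.email ?? '').trim().toLowerCase();
    return item;
  });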

Limit Node Returning Wrong Items?

  • Ensure data is sorted correctly
  • Check if you need "First" or "Last" items
  • Verify input data count

Key Benefits

  • Data Quality: Clean and validated data flowing through workflows
  • Performance: Process only necessary items, reducing costs and time
  • Reliability: Prevent duplicate processing and errors
  • Flexibility: Combine nodes for complex data manipulation
  • Control: Precise control over data flow and processing
  • Scalability: Handle large datasets efficiently

Common Use Cases Summary

| Node | Primary Use | When to Use |
|------|-------------|-------------|
| Limit | Control quantity | Testing, pagination, cost control |
| Filter | Control quality | Data validation, business rules |
| Remove Duplicates | Ensure uniqueness | Data consolidation, prevent duplicates |


Conclusion

The Limit, Filter, and Remove Duplicates nodes are essential tools for data management in n8n. By mastering these nodes, you can build more efficient, reliable, and cost-effective workflows that process exactly the data you need, exactly how you need it.

Start by using each node individually to understand their behavior, then combine them strategically to create powerful data processing pipelines. Remember: the order in which you use these nodes can significantly impact your workflow's performance and results.