VLAN manipulation/translation on Juniper MX series routers

The MX series routers are truly excellent. As well as being used for routing, they can also be used for switching. Switches do routing, so why not the other way around… right?

A basic switching setup on an MX is a VLAN bridge. Take the following config, which is akin to configuring two 802.1Q trunk ports on a switch, both carrying VLAN 66:

  interfaces {
      ge-1/0/4 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 66 {
              encapsulation vlan-bridge;
              vlan-id 66;
          }
      }
      ge-1/1/7 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 66 {
              encapsulation vlan-bridge;
              vlan-id 66;
          }
      }
  }
  bridge-domains {
      my-bridge {
          domain-type bridge;
          vlan-id 66;
          interface ge-1/0/4.66;
          interface ge-1/1/7.66;
      }
  }

Here we have 2 interfaces, both with a unit that matches traffic tagged for VLAN 66. The bridge domain sends layer 2 traffic between these two interfaces, as if it were a switch.

Bridge domains on the MX inherently do what is called VLAN Normalization/Translation. When a packet enters an interface, its VLAN is normalized to that of the bridge domain. When a packet leaves an interface, its VLAN is normalized to that of the exiting interface. The above example has the same VLAN on both interfaces and the bridge domain, so let’s look at a different example:

  interfaces {
      ge-1/0/4 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 55 {
              encapsulation vlan-bridge;
              vlan-id 55;
          }
      }
      ge-1/1/7 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 66 {
              encapsulation vlan-bridge;
              vlan-id 66;
          }
      }
  }
  bridge-domains {
      my-bridge {
          domain-type bridge;
          vlan-id 43;
          interface ge-1/0/4.55;
          interface ge-1/1/7.66;
      }
  }

In this example, when a packet tagged VLAN 55 enters on ge-1/0/4, its VLAN tag is swapped for 43 because that is the VLAN ID of the bridge. When the same packet leaves on ge-1/1/7, its VLAN tag is swapped again for 66 because that is the VLAN ID of the exiting interface.

As you can probably see, this is a silly example… the VLAN 43 is pointless here but it gives you an idea of what happens when packets traverse the bridge.

You can see this behavior when you do “show interfaces”:

  mx10> show interfaces ge-1/0/4.55
    Logical interface ge-1/0/4.55 (Index 346) (SNMP ifIndex 612)
      Flags: SNMP-Traps 0x0 VLAN-Tag [ 0x8100.55 ] In(swap .43) Out(swap .55) Encapsulation: VLAN-Bridge

  mx10> show interfaces ge-1/1/7.66
    Logical interface ge-1/1/7.66 (Index 344) (SNMP ifIndex 611)
      Flags: SNMP-Traps 0x0 VLAN-Tag [ 0x8100.66 ] In(swap .43) Out(swap .66) Encapsulation: VLAN-Bridge

The key bit is the Flags line. You can see that the VLAN tag on input packets is swapped to 43 (the bridge domain's VLAN) and the tag on output packets is swapped back to the VLAN of the interface.

The above example shows you VLAN translation. In reality, you’d probably set the vlan-id of the bridge-domain to one of your interface VLANs or perhaps “none”. If you set it to one of your VLANs, no work is done on packets entering or leaving the interface with the same VLAN ID, and a swap is done for both In and Out on the other interface. If you set it to “none”, the VLAN tag is removed as a packet comes in and a new tag is added when the packet leaves. The removal is called a “pop” and the addition a “push”.
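The three cases described so far (swap, nothing, pop/push) can be summarised in a few lines of Python. This is only a sketch of the behaviour described in this post for single-tagged interfaces, not Junos's actual implementation; the function name and the "none" label are made up for illustration.

```python
from typing import Optional, Tuple


def normalization_ops(if_vlan: int, bd_vlan: Optional[int]) -> Tuple[str, str]:
    """Return (ingress_op, egress_op) for a single-tagged logical interface.

    if_vlan -- vlan-id configured on the logical interface
    bd_vlan -- vlan-id of the bridge domain, or None for "vlan-id none"
    """
    if bd_vlan is None:
        # No tag inside the bridge: strip it on the way in, re-add it on the way out.
        return ("pop", f"push {if_vlan}")
    if bd_vlan == if_vlan:
        # The tag already matches the bridge domain, so nothing is rewritten.
        return ("none", "none")
    # Tags differ: rewrite to the bridge VLAN on ingress and back on egress.
    return (f"swap {bd_vlan}", f"swap {if_vlan}")


# The interface/bridge combinations discussed in the text.
for if_vlan, bd_vlan in [(55, 43), (66, 43), (66, 66), (55, None)]:
    print(if_vlan, bd_vlan, normalization_ops(if_vlan, bd_vlan))
```

The first two pairs correspond to the In(swap .43) Out(swap .55) and In(swap .43) Out(swap .66) flags shown in the “show interfaces” output.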

Another good example of the usage of this is to convert a double tagged (Q-in-Q) packet to a single tagged one. You might have a provider who is using Q-in-Q and you want to remove their VLAN (the S-VLAN) and use only your VLAN (the C-VLAN) on devices which are on the “other side” of your MX. Here’s an example of that:

  interfaces {
      ge-1/0/0 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 601 {
              encapsulation vlan-bridge;
              vlan-id 601;
          }
      }
      ge-1/1/3 {
          flexible-vlan-tagging;
          encapsulation flexible-ethernet-services;
          unit 601 {
              encapsulation vlan-bridge;
              vlan-tags outer 255 inner 601;
          }
      }
  }
  bridge-domains {
      vlan-601 {
          domain-type bridge;
          vlan-id 601;
          interface ge-1/0/0.601;
          interface ge-1/1/3.601;
      }
  }

So here, we have a single tagged and a double tagged interface. The assumption is that the single tagged interface is facing our equipment (e.g. an SRX) and the double tagged interface is facing the provider. When double tagged packets from the provider enter on ge-1/1/3, the outer VLAN is removed (pop) because the vlan-id of the bridge domain is 601 – the same as the inner vlan-id. When packets leave on ge-1/0/0, nothing is done as the VLAN is already 601.

In the opposite direction, single tagged packets enter on ge-1/0/0. Nothing is done to them because the vlan-id already matches that of the bridge. When packets leave on ge-1/1/3, the VLAN 255 is added (push). Here’s what the “show interfaces” says:

  mx10> show interfaces ge-1/0/0.601
    Logical interface ge-1/0/0.601 (Index 363) (SNMP ifIndex 590)
      Flags: SNMP-Traps 0x0 VLAN-Tag [ 0x8100.601 ] Encapsulation: VLAN-Bridge

  mx10> show interfaces ge-1/1/3.601
    Logical interface ge-1/1/3.601 (Index 374) (SNMP ifIndex 589)
      Flags: SNMP-Traps 0x0 VLAN-Tag [ 0x8100.255 0x8100.601 ] In(pop) Out(push 0x8100.255) Encapsulation: VLAN-Bridge

See the push/pop on the double tagged interface and nothing on the single tagged interface.
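To make the Q-in-Q case concrete, here is a small Python sketch that walks a frame's tag stack through the bridge in both directions. It only models the one situation described above (inner tag equal to the bridge vlan-id) and is an illustration, not how Junos is implemented; the function and constant names are hypothetical.

```python
from typing import List

# Tag stacks are written outer tag first, as in "vlan-tags outer 255 inner 601".
PROVIDER_TAGS = [255, 601]  # ge-1/1/3.601, double tagged
CUSTOMER_TAGS = [601]       # ge-1/0/0.601, single tagged
BRIDGE_VLAN = 601           # vlan-id of the bridge domain


def ingress(frame_tags: List[int], if_tags: List[int]) -> List[int]:
    """Normalize a frame entering the bridge domain on a logical interface."""
    if frame_tags != if_tags:
        raise ValueError("frame does not match this logical interface")
    if len(if_tags) == 2 and if_tags[1] == BRIDGE_VLAN:
        return frame_tags[1:]  # pop the outer (S-VLAN) tag
    return list(frame_tags)    # already matches the bridge: nothing to do


def egress(frame_tags: List[int], if_tags: List[int]) -> List[int]:
    """Rewrite a frame leaving the bridge domain on a logical interface."""
    if len(if_tags) == 2 and frame_tags == if_tags[1:]:
        return [if_tags[0]] + frame_tags  # push the outer (S-VLAN) tag
    return list(frame_tags)               # single tagged: nothing to do


# Provider -> our equipment: the outer tag 255 is popped.
print(egress(ingress([255, 601], PROVIDER_TAGS), CUSTOMER_TAGS))
# Our equipment -> provider: the outer tag 255 is pushed.
print(egress(ingress([601], CUSTOMER_TAGS), PROVIDER_TAGS))
```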

If you set no vlan-id on the bridge-domain, you are then allowed to define input-vlan-map and output-vlan-map on the logical interfaces. This allows you to customize exactly what happens to ingress and egress packets, rather than taking the default behavior explained above.

Chassis clustering a Juniper SRX firewall via a switch

Intro

It is recommended that clustered SRX devices are directly connected. To do this, you need to run 2 cables, one for the control plane and the other for the fabric. This is sometimes not easy (or cheap) in a data centre environment where the firewalls are in different racks – especially given that the control link must be copper, on most SRX devices, and is thus limited to 100m.

You can also cluster SRX devices by connecting the links into a switch. A common use for this would be to cluster 2 firewalls, each in different racks, via your core switching chassis cluster.

tl;dr (sorry, it’s still quite long)

You’ll need to read the chassis cluster guide. Here’s the one for the SRX 300, 320, 340, 345, 550 and 1500. On pages 44 and 45 you will see diagrams of how the devices must be connected. Most SRX devices enforce the use of a particular port for the control plane. When clustered, the control port will be renamed to something like fxp1. The fabric can usually be any port you like.

Connect the control and fabric ports of each SRX device into your switch.

The switch ports need to be configured like so:

  • MTU 8980
  • Access port (no VLAN tagging)
  • A unique VLAN – control and fabric need their own VLAN (e.g. control = 701, fabric = 702). The VLAN should have only 2 ports in it (e.g. firewall 1 control port and firewall 2 control port)
  • IGMP snooping turned off
  • CDP/LLDP/other junk turned off

You must first delete the configuration for the control interface, on each firewall, if it exists. If you don’t do this, you’ll be stuck in a strange state when the firewalls come back up as they will error when loading the configuration. If you can, you may as well delete all interfaces:

  edit
  delete interfaces
  commit

Log into each firewall via its console port. On firewall 1:

  set chassis cluster cluster-id 1 node 0 reboot

On firewall 2:

  set chassis cluster cluster-id 1 node 1 reboot

Wait for the firewalls to finish rebooting. Check the status of the cluster like so:

  show chassis cluster status

One node should be primary and the other secondary. Make sure you wait for all the “Monitor-failures” to clear before continuing.

Now you can work solely on the primary node… so you can log out of the secondary. You’ll need to assign the physical ports that you connected up for the fabric to the interfaces fab0 and fab1. Note that the ports on the secondary device will have been re-numbered. That is to say the on-board ports will no longer be ge-0/0/something, but will rather be something like ge-5/0/something. The number prefix depends on the model of SRX and, specifically, how many PIM slots it has. You’ll need to read the chassis clustering guide to work out what to do for your model.

  set interfaces fab0 fabric-options member-interfaces ge-0/0/2
  set interfaces fab1 fabric-options member-interfaces ge-5/0/2
  commit
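If you have several ports to map, the renumbering can be sketched in Python. This assumes the secondary node's ports are simply shifted by a fixed slot offset; the offset of 5 used below is taken only from the example above (ge-0/0/2 becoming ge-5/0/2), so check the chassis cluster guide for the right value on your model. The function name is made up for illustration.

```python
import re


def node1_interface(name: str, fpc_offset: int) -> str:
    """Return the name a node 0 style port gets on node 1 once clustered.

    fpc_offset is model-specific (an assumption here); look it up in the
    chassis cluster guide for your SRX.
    """
    match = re.fullmatch(r"([a-z]+)-(\d+)/(\d+)/(\d+)", name)
    if match is None:
        raise ValueError(f"unexpected interface name: {name!r}")
    prefix, fpc, pic, port = match.groups()
    return f"{prefix}-{int(fpc) + fpc_offset}/{pic}/{port}"


# Generate the two fabric member commands for the port used in this post.
fabric_port = "ge-0/0/2"
print(f"set interfaces fab0 fabric-options member-interfaces {fabric_port}")
print(f"set interfaces fab1 fabric-options member-interfaces {node1_interface(fabric_port, 5)}")
```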

Check the full cluster status:

  run show chassis cluster interfaces

You should see both control and fabric as Up.

Config for Juniper EX Series Switches

The below is the config for an EX series virtual chassis (VC). It’s simpler than if you had unclustered switches as you don’t need to worry about carrying VLANs between switches. If you don’t have a VC, you’ll need to do a little more on top of this.

  vlans {
      VLAN701 {
          description fw_control_link;
          vlan-id 701;
      }
      VLAN702 {
          description fw_fabric_link;
          vlan-id 702;
      }
  }
  protocols {
      igmp-snooping {
          vlan VLAN701 {
              disable;
          }
          vlan VLAN702 {
              disable;
          }
      }
      lldp {
          interface ge-0/0/17.0 {
              disable;
          }
          interface ge-4/0/17.0 {
              disable;
          }
          interface ge-0/0/18.0 {
              disable;
          }
          interface ge-4/0/18.0 {
              disable;
          }
      }
  }
  interfaces {
      ge-0/0/17 {
          description FW-01_Control_Link;
          mtu 8980;
          unit 0 {
              family ethernet-switching {
                  port-mode access;
                  vlan {
                      members VLAN701;
                  }
              }
          }
      }
      ge-0/0/18 {
          description FW-01_Fabric_Link;
          mtu 8980;
          unit 0 {
              family ethernet-switching {
                  port-mode access;
                  vlan {
                      members VLAN702;
                  }
              }
          }
      }
      ge-4/0/17 {
          description FW-02_Control_Link;
          mtu 8980;
          unit 0 {
              family ethernet-switching {
                  port-mode access;
                  vlan {
                      members VLAN701;
                  }
              }
          }
      }
      ge-4/0/18 {
          description FW-02_Fabric_Link;
          mtu 8980;
          unit 0 {
              family ethernet-switching {
                  port-mode access;
                  vlan {
                      members VLAN702;
                  }
              }
          }
      }
  }

Debugging

Check the status of nodes in the cluster:

  show chassis cluster status

Find out which interfaces are in the cluster:

  show chassis cluster interfaces

This will show you if data is being sent/received over the control and fabric links:

  show chassis cluster statistics

Check if the ARP table has entries for the other firewall (i.e. they have layer 2 connectivity):

  show arp | match fxp

Configuring Node Specific Things

When you change the configuration on one node, it will be automatically applied on the other node. However, you will want some settings that are specific to a single node, for example hostname and management IP. You can put these settings into a group named after the node, e.g. groups node0.

You’ll also need to set apply-groups “${node}” in order to have the node specific configuration apply to the right nodes.

Example config below for configuring hostname and management IP:

  groups {
      node0 {
          system {
              host-name fw-01;
          }
          interfaces {
              fxp0 {
                  unit 0 {
                      family inet {
                          address 192.168.1.1/24;
                      }
                  }
              }
          }
      }
      node1 {
          system {
              host-name fw-02;
          }
          interfaces {
              fxp0 {
                  unit 0 {
                      family inet {
                          address 192.168.1.2/24;
                      }
                  }
              }
          }
      }
  }
  apply-groups "${node}";

[SOLVED] pfSense – pfsync_undefer_state: unable to find deferred state

This pfSense bug has been present since 2.2 and is still present at the time of writing this. It occurs when using limiters on a pair of pfSense servers set up in a HA cluster. When connections are received which match the firewall rule you have put in place to force the traffic into the relevant limiter, those states fail to correctly sync and pfSense will experience extremely high load values. You will see the message pfsync_undefer_state: unable to find deferred state on the console and in logs. pfSense may also become inaccessible over HTTP and SSH.

The “fix” is to disable all synchronization of state. To do this, go to System -> High Avail. Sync and untick the Synchronize states box. You will need to do this on all nodes in the HA cluster.

It should be noted that this isn’t a particularly good fix. If states do not synchronize between nodes in the cluster, TCP connections will be (not very cleanly) terminated and must be re-established. Some applications may not respond too well to this.

 

Configuring the permissions of a Samba share

Samba share permissions can be a bit fiddly. The user and group IDs which own the file on the Samba server will propagate over to the client machines, which will enforce local permissions themselves.

Ideally, you want to have the same users/groups on all machines. This isn’t always practical but could be achieved with a config management tool such as Puppet or SaltStack, or indeed by backing your local users from an LDAP server.

If this is not possible, the following is suggested:

On your Samba server

  • Create a group which will own all the files, for example samba-users
  • Add all of your Samba users to the group you created – e.g. adduser downloader samba-users
  • Chown all of your shared files and folders to root:samba-users
  • Chmod all of your shared files to 660
  • Chmod all of your shared folders to 770
  • Add the below to the config for your share to enforce the above for all new files and folders:
      create mask = 0660
      force create mode = 0660
      directory mask = 0770
      force directory mode = 0770
      force group = samba-users
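The chown/chmod steps above treat files and folders differently (660 for files, 770 for folders), which a plain recursive chmod does not. Below is a minimal Python sketch of one way to apply them to an existing tree. The function name is made up for illustration, and passing a group requires running as root with that group already created.

```python
import os
import shutil
from typing import Optional

FILE_MODE = 0o660  # rw for owner and group, nothing for others
DIR_MODE = 0o770   # rwx for owner and group, nothing for others


def apply_share_permissions(root: str, group: Optional[str] = None) -> None:
    """Set ownership and modes on everything under root (including root).

    If group is given, every path is chowned to root:<group> first.
    Symlinks are skipped so that nothing outside the share is touched.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        targets = [(dirpath, DIR_MODE)]
        targets += [(os.path.join(dirpath, name), FILE_MODE) for name in filenames]
        for path, mode in targets:
            if os.path.islink(path):
                continue
            if group is not None:
                shutil.chown(path, user="root", group=group)
            os.chmod(path, mode)
```

On the Samba server you would call something like apply_share_permissions("/srv/downloads", group="samba-users"), where /srv/downloads is a placeholder for your own share path.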

On your client machine(s)

  • Create a group which will be able to access all the files on the share, for example samba-users
  • Obtain the group ID (GID) from /etc/group for this group
  • In the mount options of the share (in /etc/fstab) add the uid 0 (root) as in the below example
  • In the mount options of the share (in /etc/fstab) add the gid as in the below example where the GID is 1002
      //192.168.1.123/downloads /mnt/downloads cifs username=downloader,password=foobarbaz,iocharset=utf8,uid=0,gid=1002 0 0
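If you manage several clients, the GID lookup and the fstab entry can be generated rather than typed by hand. The following Python sketch uses the standard grp module to read the GID; the function names are hypothetical, and the values in the usage note are the ones from the example above.

```python
import grp


def group_gid(name: str) -> int:
    """Look up a local group's GID (the same value you would read from /etc/group)."""
    return grp.getgrnam(name).gr_gid


def cifs_fstab_line(share: str, mountpoint: str, username: str,
                    password: str, gid: int, uid: int = 0) -> str:
    """Build an /etc/fstab entry that maps all files to uid (root by default) and gid."""
    options = ",".join([
        f"username={username}",
        f"password={password}",
        "iocharset=utf8",
        f"uid={uid}",
        f"gid={gid}",
    ])
    return f"{share} {mountpoint} cifs {options} 0 0"
```

For example, cifs_fstab_line("//192.168.1.123/downloads", "/mnt/downloads", "downloader", "foobarbaz", group_gid("samba-users")) produces a line of the same shape as the one above, with whatever GID samba-users has on that client.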

If you unmount and remount the share, you will see that all the files are now owned by the group you specified in fstab.

Disclaimer

There might be a better way… feel free to comment if you know what it is.

SOLVED – “mount error(13): Permission denied” when doing cifs mount on LXC container (Proxmox)

When trying to do a command like this on a system running inside an LXC container on Proxmox:

  mount -t cifs '\\172.55.0.60\downloads' -o username=myuser,password=mypass /mnt/downloads

 

Linux threw the error mount error(13): Permission denied. `tcpdump` showed that no traffic was leaving the container and `strace` didn’t throw up a lot of useful info.

dmesg said this:

  [171150.670602] audit: type=1400 audit(1471291773.083:167): apparmor="DENIED" operation="mount" info="failed type match" error=-13 profile="lxc-container-default" name="/run/shm/" pid=59433 comm="mount" flags="rw, nosuid, nodev, noexec, remount, relatime"

This reddit post finally yielded the answer. You need to edit /etc/apparmor.d/lxc/lxc-default and below the last deny mount line, add this:

  allow mount fstype=cifs,

The final config file will look something like this:

  # Do not load this file. Rather, load /etc/apparmor.d/lxc-containers, which
  # will source all profiles under /etc/apparmor.d/lxc

  profile lxc-container-default flags=(attach_disconnected,mediate_deleted) {
    #include <abstractions/lxc/container-base>

    # the container may never be allowed to mount devpts. If it does, it
    # will remount the host's devpts. We could allow it to do it with
    # the newinstance option (but, right now, we don't).
    deny mount fstype=devpts,
    allow mount fstype=cifs,
  }
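If you need to make this change on more than one Proxmox host, the edit can be scripted. The Python sketch below adds the rule after the last deny mount line and leaves the file alone if the rule is already present. The function name is made up for illustration; back up the profile before writing anything to it.

```python
RULE = "allow mount fstype=cifs,"


def allow_cifs_mount(profile_text: str) -> str:
    """Return profile_text with the cifs rule added after the last 'deny mount' line."""
    lines = profile_text.splitlines()
    if any(line.strip() == RULE for line in lines):
        return profile_text  # already patched, nothing to do
    deny_lines = [i for i, line in enumerate(lines)
                  if line.strip().startswith("deny mount")]
    if not deny_lines:
        raise ValueError("no 'deny mount' rule found in profile")
    last = deny_lines[-1]
    # Reuse the indentation of the deny line so the file stays tidy.
    indent = lines[last][: len(lines[last]) - len(lines[last].lstrip())]
    lines.insert(last + 1, indent + RULE)
    return "\n".join(lines) + "\n"
```

To use it, read /etc/apparmor.d/lxc/lxc-default, pass its contents through allow_cifs_mount, and write the result back as root.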

Now restart AppArmor:

  systemctl restart apparmor.service

Shut down your container and start it again.

Your mount command might well work now. If not, check logs again to be sure it’s not a secondary problem (e.g. incorrect hashing algorithm).